# Developer Guide


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

## Project Structure

The bs-cpg package uses [nbdev](https://nbdev.fast.ai/) for development,
which means the source code and documentation are authored directly in
Jupyter notebooks and then compiled into Python modules.

### Key Directories

- `nbs/`: Contains the Jupyter notebooks where all code and
  documentation are written
  - `index.ipynb`: Main documentation homepage
  - `00_download_processed_geo.ipynb`: GEO data acquisition module
  - `01_setup.ipynb`: Configuration and data path management
  - `02_download_ref.ipynb`: Reference genome and chain file utilities
  - `03_liftover_pos.ipynb`: Genomic coordinate processing and liftover
  - `04_developers.ipynb`: This file
- `bs_cpg/`: Auto-generated Python package (exported from notebooks)
- `_proc/`: Processing notebooks (not part of the main package)
- `data/`: Sample data files for testing

### Configuration Files

- `nbdev.yml`: nbdev configuration
- `settings.ini`: Project metadata and package settings
- `pyproject.toml`: Python project configuration
- `_quarto.yml`: Quarto configuration for the documentation site

## Setting Up Development Environment

### 1. Clone the Repository

``` bash
git clone https://github.com/magistak/bs-cpg.git
cd bs-cpg
```

### 2. Install in Development Mode

Install the package in editable mode so that changes are reflected
immediately:

``` bash
pip install -e .
```

In editable mode, pip links the installed package to your working copy
instead of copying it, so you do not need to reinstall after each change.

### 3. Install Development Dependencies

``` bash
pip install nbdev jupyter
```

### 4. Verify Installation

``` bash
python -c "import bs_cpg; print(bs_cpg.__version__)"
```

## Working with nbdev

### Understanding Cell Directives

nbdev uses special cell directives (comments starting with `#|`) to
control how cells are processed:

<table>
<colgroup>
<col style="width: 52%" />
<col style="width: 47%" />
</colgroup>
<thead>
<tr>
<th>Directive</th>
<th>Purpose</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>#| export</code></td>
<td>Export this cell’s code to the package module</td>
</tr>
<tr>
<td><code>#| hide</code></td>
<td>Hide this cell in documentation but execute it</td>
</tr>
<tr>
<td><code>#| default_exp &lt;module_name&gt;</code></td>
<td>Set the default module for subsequent <code>#| export</code>
cells</td>
</tr>
<tr>
<td><code>#| eval: false</code></td>
<td>Include in docs but don’t execute when running tests</td>
</tr>
</tbody>
</table>
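
To make the directive syntax concrete, here is a minimal sketch that reads the leading `#|` lines of a cell's source. The helper `parse_directives` is hypothetical and only illustrates the comment format; it is not how nbdev itself processes notebooks.

```python
def parse_directives(cell_source: str) -> dict[str, str]:
    """Collect the leading `#|` directives of a cell (illustrative only)."""
    directives = {}
    for line in cell_source.splitlines():
        stripped = line.strip()
        if not stripped.startswith("#|"):
            break  # directives are expected at the top of the cell
        body = stripped[2:].strip()
        # "key: value" form (e.g. eval: false) vs. "name args" form (e.g. default_exp core)
        if ":" in body:
            key, _, value = body.partition(":")
        else:
            key, _, value = body.partition(" ")
        directives[key.strip()] = value.strip()
    return directives


cell = "#| export\n#| eval: false\ndef f(): pass"
print(parse_directives(cell))
```

A flag such as `#| export` maps to an empty string here, while `#| default_exp core` maps `default_exp` to `core`.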

### Example Cell Structure

``` python
#| export
def my_function(x: int) -> str:
    """A helpful docstring."""
    return f"Result: {x}"
```

### Writing Tests in Notebooks

You can include tests directly after function definitions:

``` python
# Example/test
result = my_function(42)
assert result == "Result: 42"
print(f"✅ Test passed: {result}")
```

## Development Workflow

### 1. Make Changes

Edit the Jupyter notebooks in the `nbs/` directory. Use the `#| export`
directive to mark cells that should be part of the package.

### 2. Prepare and Export

Run the following command to:

- Export code from notebooks to Python modules (`nbdev_export`)
- Run tests in the notebooks (`nbdev_test`)
- Update the README from `index.ipynb` (`nbdev_readme`)

``` bash
nbdev_prepare
```

This is the primary command you’ll use during development.
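
Conceptually, the steps listed above run one after another and stop at the first failure. The sketch below mimics that behaviour with `subprocess`; the `run_steps` helper and its `dry_run` flag are hypothetical and are not part of nbdev, which may perform additional work inside `nbdev_prepare`.

```python
import subprocess

# Commands the guide attributes to nbdev_prepare, in order.
STEPS = ("nbdev_export", "nbdev_test", "nbdev_readme")


def run_steps(steps=STEPS, dry_run=False):
    """Run each command in order and return the ones that completed."""
    completed = []
    for cmd in steps:
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so later steps are skipped once one step fails
            subprocess.run([cmd], check=True)
        completed.append(cmd)
    return completed


# dry_run=True only lists the steps without invoking nbdev
print(run_steps(dry_run=True))
```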

### 3. Preview Documentation Locally

To see how your documentation looks before publishing:

``` bash
nbdev_preview
```

This starts a local web server showing the rendered documentation.

### 4. Run Specific Workflows

To run a single step instead of the full `nbdev_prepare` sequence, use the
individual nbdev commands:

``` bash
# Export code from notebooks
nbdev_export

# Run tests defined in notebooks
nbdev_test

# Update README from index.ipynb
nbdev_readme
```

## Adding New Functions

### Step-by-Step Example

To add a new function to the bs-cpg package:

1.  **Open the appropriate notebook** (e.g., `03_liftover_pos.ipynb` for
    coordinate functions)

2.  **Add a markdown cell** explaining the function:

    ``` markdown
    ### My New Function

    Brief description of what the function does.
    ```

3.  **Add a code cell with `#| export`**:

    ``` python
    #| export
    def my_new_function(data: list) -> int:
        """Comprehensive docstring with Args, Returns, Examples."""
        # Placeholder implementation: sum the items (matches the example in step 4)
        return sum(data)
    ```

4.  **Add test/example cells** below:

    ``` python
    # Example usage
    result = my_new_function([1, 2, 3])
    assert result == 6
    ```

5.  **Run `nbdev_prepare`** to export and test:

    ``` bash
    nbdev_prepare
    ```

6.  **Verify the function** is accessible from the package:

    ``` python
    from bs_cpg.liftover_pos import my_new_function
    ```
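
As one possible shape for the docstring in step 3, the sketch below fills in the Args, Returns, and Examples sections. The summing body is an assumption inferred from the assertion in step 4; `my_new_function` is a placeholder, not a function that exists in bs-cpg.

```python
#| export
def my_new_function(data: list) -> int:
    """Sum the items in `data`.

    Args:
        data: Numbers to add up.

    Returns:
        The total of all items (0 for an empty list).

    Examples:
        >>> my_new_function([1, 2, 3])
        6
    """
    # Assumed behaviour, chosen to satisfy the example assertion in step 4
    return sum(data)


assert my_new_function([1, 2, 3]) == 6
```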

## Managing Dependencies

### Core Dependencies

bs-cpg depends on several key packages:

- **pandas**: Data manipulation and analysis
- **pysam**: Reading/writing genomic files
- **geofetch**: Integration with GEO (Gene Expression Omnibus)
- **tenacity**: Automatic retry logic for network requests
- **liftover**: Genomic coordinate conversion

### Updating Dependencies

Edit the `install_requires` entry in `settings.ini`:

``` ini
install_requires = 
    pandas
    pysam
    geofetch
    tenacity
    liftover
```

Then run:

``` bash
pip install -e .
```
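
After reinstalling, a quick way to confirm that the dependencies resolve is to ask Python whether each module can be found. This is a minimal sketch: the `missing_modules` helper is hypothetical, and the import names are assumed to match the package names listed above.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Assumption: each package is imported under the same name it is installed as.
DEPS = ["pandas", "pysam", "geofetch", "tenacity", "liftover"]
print(missing_modules(DEPS))
```

An empty list means every listed module was found in the current environment.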

## Troubleshooting

### Import Errors After Editing

If you modify code and Python still sees old versions, you may need to
reload the module:

``` python
# In Jupyter
%load_ext autoreload
%autoreload 2
```

Or reinstall in development mode:

``` bash
pip install -e . --no-deps
```
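
Outside Jupyter, where the `%autoreload` magic is unavailable, the standard library's `importlib.reload` can refresh a single module in a running session. The sketch uses `json` as a stand-in; in practice you would pass the bs_cpg submodule you have just re-exported.

```python
import importlib
import json  # stand-in for a bs_cpg submodule you have just re-exported

# reload() re-executes the module's source and updates the existing module object
reloaded = importlib.reload(json)
print(reloaded is json)
```

Names bound earlier with `from module import name` still point at the old objects after a reload, so import them again.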

### nbdev_prepare Fails

Check that all cells marked with `#| export` can run without errors:

``` bash
nbdev_test  # Run notebook tests
```

### Module Not Found

Make sure you’ve run `nbdev_export` to generate the Python files:

``` bash
nbdev_export
```

## Publishing Updates

### Prepare for Release

1.  **Update version** in `settings.ini`:

    ``` ini
    version = 0.1.2
    ```

2.  **Run final checks**:

    ``` bash
    nbdev_prepare
    nbdev_preview
    ```

3.  **Commit and push**:

    ``` bash
    git add .
    git commit -m "v0.1.2: Add new features"
    git push
    ```
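
Before committing, it can help to confirm that the version in `settings.ini` is the one you intend to tag. The sketch below parses the value with `configparser`; the `read_version` helper is hypothetical, and it assumes the settings live under a `[DEFAULT]` section, which may not match your file.

```python
from configparser import ConfigParser


def read_version(ini_text: str) -> str:
    """Return the `version` value from settings.ini-style text."""
    parser = ConfigParser()
    parser.read_string(ini_text)
    # Assumption: the version key sits in the [DEFAULT] section
    return parser["DEFAULT"]["version"]


sample = "[DEFAULT]\nversion = 0.1.2\n"
print(read_version(sample))
```

To check the real file, pass in its contents, for example `read_version(open("settings.ini").read())`.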

### Deploy Documentation

To deploy documentation to GitHub Pages:

``` bash
nbdev_ghp_deploy
```

### Build and Release Package

The package is configured to build and release automatically via GitHub
Actions. To build it manually:

``` bash
python -m pip install build
python -m build
```

## Resources

- [nbdev Documentation](https://nbdev.fast.ai/): Complete guide to nbdev
- [Jupyter Notebook
  Documentation](https://jupyter-notebook.readthedocs.io/): Notebook
  features
- [Pandas Documentation](https://pandas.pydata.org/docs/): Data analysis
  library
- [pysam Documentation](https://pysam.readthedocs.io/): Genomic file
  handling
- [UCSC Genome Browser](https://genome.ucsc.edu/): Reference for chain
  files and genomes
