bs-cpg

A Python package for downloading and processing RRBS DNA methylation data.

This is the main documentation homepage for bs-cpg. For comprehensive module-specific documentation, API references, and examples, see the sections below. For development guidance, see the Developer Guide.

Module Documentation

Detailed documentation for each module is available in separate notebooks.

Quick Start

Basic Workflow Example

Here’s a simple workflow to get started with bs-cpg:

from bs_cpg.setup import read_sample_cpg
from bs_cpg.liftover_ps import cpg_reads, cpg_percent, liftover_positions

# 1. Load sample CpG data
df = read_sample_cpg(["chromosome", "pos"])

# 2. Validate CpG sites
reads = cpg_reads(df["chromosome"], df["pos"], "hg19", index_base=1)
percent_valid = cpg_percent(reads)
print(f"Valid CpG sites: {percent_valid:.1f}%")

# 3. Perform liftover to hg38
new_chroms, new_pos = liftover_positions(
    chromosomes=df["chromosome"],
    positions=df["pos"],
    genome_from="hg19",
    genome_to="hg38",
    index_base_from=1,
    index_base_to=1
)

Installation & Setup

Installation

Install the latest version from the GitHub repository:

$ pip install git+https://github.com/magistak/bs-cpg.git

or install a release from PyPI:

$ pip install bs_cpg

Note: Ensure you have htslib (specifically bgzip) installed and in your PATH for reference genome processing.
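
Before processing a reference genome, it can help to confirm that bgzip is actually reachable. The following is a minimal sketch using only the Python standard library; the helper name find_tool is hypothetical and is not part of bs-cpg.

```python
import shutil
from typing import Optional


def find_tool(name: str) -> Optional[str]:
    """Return the full path of an executable found on PATH, or None if absent.

    Hypothetical helper for illustration; not part of the bs-cpg API.
    """
    return shutil.which(name)


bgzip_path = find_tool("bgzip")
if bgzip_path is None:
    print("bgzip not found on PATH; install htslib before processing reference genomes")
else:
    print(f"bgzip found at {bgzip_path}")
```

If the check reports that bgzip is missing, install htslib through your system package manager or conda and re-run it.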

Documentation

Documentation is hosted on this repository’s GitHub Pages site. Package releases are available on PyPI.

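The example below uses Geofetcher from bs_cpg.download_processed to look up the projects for GEO accession GSE51239, with just_metadata=True:

from bs_cpg.download_processed import Geofetcher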
geo = Geofetcher(just_metadata=True)
projects = geo.get_projects('GSE51239')

Liftover & Validation (bs_cpg.liftover_ps)

Perform genomic liftover while explicitly handling 0-based or 1-based indexing:

from bs_cpg.liftover_ps import liftover_positions
new_chroms, new_pos = liftover_positions(
    chromosomes=["chr1"],
    positions=[100001],
    genome_from="hg19",
    genome_to="hg38",
    index_base_from=1
)
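
To make the index-base arguments concrete: the same genomic base has coordinate p in a 0-based system and p + 1 in a 1-based system. The sketch below shows that shift in plain Python. The function convert_index_base is a hypothetical illustration and not part of bs-cpg; how liftover_positions applies the shift internally is not shown in this README.

```python
def convert_index_base(positions, base_from, base_to):
    """Shift positions between 0-based and 1-based coordinates.

    Hypothetical helper for illustration; not part of the bs-cpg API.
    """
    if base_from not in (0, 1) or base_to not in (0, 1):
        raise ValueError("index bases must be 0 or 1")
    offset = base_to - base_from  # -1, 0, or +1
    return [p + offset for p in positions]


one_based = [100001, 100050]
zero_based = convert_index_base(one_based, base_from=1, base_to=0)
print(zero_based)  # [100000, 100049]
```

Stating the base explicitly for every input avoids silent off-by-one errors when mixing files that use different conventions.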

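To check that your positions fall on CpG sites, pass the output of cpg_reads to cpg_percent(reads), which returns the percentage of valid CpG sites, as in the Quick Start workflow.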
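
As a rough illustration of what such a check means, the sketch below counts how many positions in a toy sequence start a "CG" dinucleotide on the forward strand. This is an assumption about the general idea only: the function cpg_fraction_valid and the toy sequence are hypothetical, and the actual logic of cpg_reads and cpg_percent (including any reverse-strand handling) may differ.

```python
def cpg_fraction_valid(sequence, positions, index_base=1):
    """Percentage of positions where the sequence reads 'CG' starting there.

    Hypothetical helper for illustration; forward strand only,
    not the bs-cpg implementation.
    """
    if not positions:
        return 0.0
    hits = 0
    for pos in positions:
        i = pos - index_base  # convert to a 0-based string offset
        if i >= 0 and sequence[i:i + 2].upper() == "CG":
            hits += 1
    return 100.0 * hits / len(positions)


toy_sequence = "ACGTTCGA"  # CpG sites start at 1-based positions 2 and 6
percent = cpg_fraction_valid(toy_sequence, [2, 6, 4])
print(f"Valid CpG sites: {percent:.1f}%")  # Valid CpG sites: 66.7%
```

A low percentage on real data often points to a wrong genome build or a mismatched index base rather than bad measurements.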