# Example: Create a Geofetcher instance
geo = Geofetcher(just_metadata=True)
acc = 'GSE51239'
print(f"Fetching metadata for project: {acc}")
[INFO] [18:08:41] Metadata folder: /mnt/idms/home/magyary/bs-dna-methyl/nbs/project_name
The module provides a resilient wrapper around geofetch.Geofetcher that includes:
Geofetcher (name:str='', metadata_root:str='', metadata_folder:str='', just_metadata:bool=False, refresh_metadata:bool=False, config_template:str|None=None, pipeline_samples:str|None=None, pipeline_project:str|None=None, skip:int=0, acc_anno:bool=False, use_key_subset:bool=False, processed:bool=False, data_source:str='samples', filter:str|None=None, filter_size:str|None=None, geo_folder:str='.', split_experiments:bool=False, bam_folder:str='', fq_folder:str='', sra_folder:str='', bam_conversion:bool=False, picard_path:str='', input:str|None=None, const_limit_project:int=50, const_limit_discard:int=1000, attr_limit_truncate:int=500, max_soft_size:str='1GB', discard_soft:bool=False, add_dotfile:bool=False, disable_progressbar:bool=False, add_convert_modifier:bool=False, opts:object|None=None, max_prefetch_size:str|int|None=None, **kwargs:object)
Class for downloading or retrieving projects, metadata, and data from GEO and SRA.
Query GEO for available processed files within a project:
[INFO] [18:08:41] Metadata folder: /mnt/idms/home/magyary/bs-dna-methyl/nbs/project_name
[INFO] [18:08:41] Trying GSE51239 (not a file) as accession...
[INFO] [18:08:41] Trying GSE51239 (not a file) as accession...
[INFO] [18:08:41] Skipped 0 accessions. Starting now.
[INFO] [18:08:41] Processing accession 1 of 1: 'GSE51239'
[INFO] [18:08:43] Processed 48 samples.
[INFO] [18:08:43] Expanding metadata list...
[INFO] [18:08:43] Found SRA Project accession: SRP030612
[INFO] [18:08:43] Downloading SRP030612 sra metadata
[INFO] [18:08:46] Parsing SRA file to download SRR records
[INFO] [18:08:46] Dry run, no data will be downloaded
[INFO] [18:08:46] Finished processing 1 accession(s)
[INFO] [18:08:46] Cleaning soft files ...
[INFO] [18:08:46] Creating complete project annotation sheets and config file...
{'GSE51239_raw': Project
48 samples (showing first 20): hsperm-524-90, hsperm-530-90, hsperm-533-90, hsperm-534-90, h8c-1, h8c-2, hblast-1, hblast-2, hblast-3, hblastsingle-2, hblastsingle-5, hicm-1, hicm-2, hte-1, hte-2, hesp0-e1, hesp0-e4, hesp0-e5, hesp1-e1, hesp1-e4
Sections: name, pep_version, sample_table, experiment_metadata, sample_modifiers, description}
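The output above is consistent with a metadata-only query: the log reports a dry run, and the result is a dictionary holding a single project under the key `GSE51239_raw`. A minimal sketch of such a query follows, assuming the upstream `geofetch.Geofetcher` is used directly rather than this module's wrapper and that its `get_projects` method returns a dictionary of projects keyed by name. The helper names `raw_project_key` and `fetch_metadata` are hypothetical and only serve the illustration.

```python
def raw_project_key(accession: str) -> str:
    """Return the dictionary key of the raw-data project for an accession.

    The "<accession>_raw" pattern is taken from the output shown above;
    other configurations may produce different keys.
    """
    return f"{accession}_raw"


def fetch_metadata(accession: str):
    """Fetch project metadata for a GEO accession without downloading data.

    Requires the geofetch package and network access. The import is done
    lazily so this sketch can be loaded without geofetch installed.
    """
    from geofetch import Geofetcher

    # just_metadata=True corresponds to the "Dry run" message in the log.
    geo = Geofetcher(just_metadata=True)
    projects = geo.get_projects(accession)
    return projects[raw_project_key(accession)]
```

Calling `fetch_metadata("GSE51239")` would then be expected to return the project object summarized above, provided the key pattern holds for the chosen settings.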
Create a Geofetcher instance configured for downloading processed data:
[INFO] [18:09:55] Metadata folder: /mnt/idms/home/magyary/bs-dna-methyl/nbs/project_name
[INFO] [18:09:55] Trying GSE51239 (not a file) as accession...
[INFO] [18:09:55] Trying GSE51239 (not a file) as accession...
[INFO] [18:09:55] Skipped 0 accessions. Starting now.
[INFO] [18:09:55] Processing accession 1 of 1: 'GSE51239'
[INFO] [18:09:57] Processed 48 samples.
[INFO] [18:09:57] Expanding metadata list...
[INFO] [18:09:57] Found SRA Project accession: SRP030612
[INFO] [18:09:57] Downloading SRP030612 sra metadata
[INFO] [18:09:58] Parsing SRA file to download SRR records
[INFO] [18:09:58] Getting SRR: SRR1003182 in (GSE51239)
2025-07-28T16:09:58 prefetch.3.2.1: 1) Resolving 'SRR1003182'...
2025-07-28T16:09:59 prefetch.3.2.1: Current preference is set to retrieve SRA Normalized Format files with full base quality scores
[INFO] [18:10:00] Getting SRR: SRR1003183 in (GSE51239)
2025-07-28T16:10:00 prefetch.3.2.1: 1) 'SRR1003182' is found locally
2025-07-28T16:10:00 prefetch.3.2.1: 1) Resolving 'SRR1003183'...
2025-07-28T16:10:01 prefetch.3.2.1: Current preference is set to retrieve SRA Normalized Format files with full base quality scores
2025-07-28T16:10:02 prefetch.3.2.1: 1) Downloading 'SRR1003183'...
2025-07-28T16:10:02 prefetch.3.2.1: SRA Normalized Format file is being retrieved
2025-07-28T16:10:02 prefetch.3.2.1: Downloading via HTTPS...
2025-07-28T16:10:02 prefetch.3.2.1: Continue download of 'SRR1003183' from 154660408
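The log above shows `prefetch` resolving and retrieving individual SRR runs, so this run was not restricted to metadata. A minimal sketch of a configuration that could trigger such downloads is given below. It assumes the upstream `geofetch.Geofetcher`, an SRA Toolkit `prefetch` binary on the `PATH`, and that `get_projects` accepts a `just_metadata` argument as in upstream geofetch. The helpers `download_settings` and `download_project` are hypothetical names, not part of the module.

```python
def download_settings(sra_folder: str) -> dict:
    """Build keyword arguments for a Geofetcher that downloads data.

    Both keys are parameters listed in the Geofetcher signature above.
    """
    return {
        "just_metadata": False,    # allow run data to be fetched, not only metadata
        "sra_folder": sra_folder,  # location for the downloaded SRA files
    }


def download_project(accession: str, sra_folder: str):
    """Download the runs of a GEO accession and return the resulting projects.

    Requires geofetch, the SRA Toolkit, and network access.
    """
    from geofetch import Geofetcher  # lazy import, see note above

    geo = Geofetcher(**download_settings(sra_folder))
    # Assumption: get_projects takes its own just_metadata flag,
    # so it is passed explicitly rather than relying on the default.
    return geo.get_projects(accession, just_metadata=False)
```

Keeping the settings in a small function makes it easy to check the configuration without starting a download, which can be large for a 48-sample project.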
Once downloaded, you can explore the sample table to see available processed files:
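A minimal sketch of that exploration, assuming the returned project is a PEP project object whose `sample_table` attribute is a pandas DataFrame (the `sample_table` section appears in the project summary above). The column names that describe files depend on the accession, so the hypothetical helper `file_like_columns` simply searches the column names for keywords; `show_file_columns` is likewise a hypothetical name.

```python
def file_like_columns(column_names, keywords=("file", "url")):
    """Return the column names that contain any keyword, case-insensitively."""
    return [
        name
        for name in column_names
        if any(keyword in str(name).lower() for keyword in keywords)
    ]


def show_file_columns(project, n_rows: int = 5):
    """Print the first rows of the file-related columns of a sample table.

    `project` is assumed to expose a pandas DataFrame as `sample_table`.
    """
    table = project.sample_table
    columns = file_like_columns(table.columns)
    if not columns:
        print("No file-related columns found; available columns:")
        print(list(table.columns))
        return
    print(table[columns].head(n_rows))
```

If no column matches, the function prints every available column name instead, which is usually the quickest way to find where a given accession stores its file information.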