POC of new meta.yml structure and ontologies #5867

Closed
wants to merge 3 commits into from
103 changes: 58 additions & 45 deletions modules/nf-core/atlas/splitmerge/meta.yml
@@ -7,58 +7,71 @@ keywords:
- read group
tools:
- "atlas":
description: "ATLAS, a suite of methods to accurately genotype and estimate genetic diversity"
description: "ATLAS, a suite of methods to accurately genotype and estimate genetic
diversity"
homepage: "https://bitbucket.org/wegmannlab/atlas/wiki/Home"
documentation: "https://bitbucket.org/wegmannlab/atlas/wiki/Home"
tool_dev_url: "https://bitbucket.org/wegmannlab/atlas"
doi: "10.1101/105346"
licence: "['GPL v3']"
licence: ["GPL v3"]
identifier: biotools:atlas_db
Member
Should just be atlas, not atlas_db in this case: https://bio.tools/atlas
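
A minimal sketch of how such an identifier could be checked by hand: it turns a `biotools:<id>` string into the registry page to open. The helper name `biotools_url` is hypothetical, and the URL scheme is an assumption inferred from the link in the comment above, not something this PR defines.

```python
def biotools_url(identifier: str) -> str:
    """Build the bio.tools page URL for a 'biotools:<id>' identifier.

    Assumption: registry pages live at https://bio.tools/<id>,
    as the link in the review comment suggests.
    """
    prefix, sep, tool_id = identifier.partition(":")
    if prefix != "biotools" or not sep or not tool_id:
        raise ValueError(f"not a biotools identifier: {identifier!r}")
    return f"https://bio.tools/{tool_id}"


# The identifier suggested by the reviewer
print(biotools_url("biotools:atlas"))
```

Opening the resulting URL shows whether the entry describes the intended tool.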

input:
- meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- bam:
type: file
description: Single input BAM file.
pattern: "*.bam"
- bai:
type: file
description: The BAI file for the input BAM file
pattern: "*.bai"
- read_group_setting:
type: file
description: |
TXT file containing the split and merge settings for
each readgroup. Each line consist of one readgroup,
single/double identifier and the maximum cycle number
of the sequencer. e.g. "RG1 single 100"
pattern: "*.txt"
- blacklist:
type: file
description: |
blacklist.txt (optional), A txt file with blacklisted read names
that should be ignored and just written to file, each on a new line
pattern: "*.txt"
- - meta:
qualifier: val
type: map
Member
@bentsherman - does this look about right to you, in terms of what we call things here?

What's the use of adding the qualifier when you already have the type?

Member Author
The qualifier can be useful for checking whether two modules have the same inputs and outputs.
For example, two different outputs could both have type string, while one has the qualifier val and the other env.

You mean to check whether the modules can be chained? In that case it doesn't matter whether it is a val or env, as long as they are both strings.

Member Author
I am thinking more of a setting where modules are interchangeable, for example when benchmarking different tools. I would like to make sure that, whether I use aligner1 or aligner2, both the inputs and the outputs are exactly the same.
I know this can already be done with if/else and by manipulating the channels if needed. But in the future, it would be cool if we could use this to automatically add any module that matches the requirements.

Even there it doesn't matter if the string input is a val or env, as long as it is a string they are interchangeable.

The only difference is that val says "provide this value as a variable" whereas env says "provide this value as an environment variable", but those details are internal to the process.

Member Author
OK, I see what you mean! Would it be best to remove the qualifier?

Yes, I think so.
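
To make the outcome of this thread concrete, here is a rough sketch of the interchangeability check discussed above. It compares two modules by the types of their channel elements and only optionally by qualifier. The nested list-of-lists layout mirrors the proposed meta.yml structure, but the function names, the `aligner1`/`aligner2` inputs, and the `preset` element are hypothetical and exist only for illustration.

```python
from typing import Any

# One channel is a list of single-key mappings: {name: {type: ..., qualifier: ...}}
Channel = list[dict[str, dict[str, Any]]]


def signature(channels: list[Channel], include_qualifier: bool = False) -> list[tuple]:
    """Reduce each channel to a tuple of per-element keys, ignoring element names."""
    sig = []
    for channel in channels:
        elems = []
        for element in channel:
            (attrs,) = element.values()  # exactly one name per element
            if include_qualifier:
                elems.append((attrs.get("qualifier"), attrs["type"]))
            else:
                elems.append((attrs["type"],))
        sig.append(tuple(elems))
    return sig


def interchangeable(a: list[Channel], b: list[Channel], include_qualifier: bool = False) -> bool:
    """Two modules match when their channel signatures are identical."""
    return signature(a, include_qualifier) == signature(b, include_qualifier)


# Hypothetical inputs of two aligner modules that differ only in val vs env
aligner1_inputs = [
    [{"meta": {"qualifier": "val", "type": "map"}},
     {"reads": {"qualifier": "path", "type": "file"}}],
    [{"preset": {"qualifier": "val", "type": "string"}}],
]
aligner2_inputs = [
    [{"meta": {"qualifier": "val", "type": "map"}},
     {"reads": {"qualifier": "path", "type": "file"}}],
    [{"preset": {"qualifier": "env", "type": "string"}}],
]

print(interchangeable(aligner1_inputs, aligner2_inputs))                          # True
print(interchangeable(aligner1_inputs, aligner2_inputs, include_qualifier=True))  # False
```

Comparing on type alone treats the two modules as drop-in replacements, which matches the point that val versus env is internal to the process.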

description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- bam:
qualifier: path
type: file
description: Single input BAM file.
pattern: "*.bam"
- bai:
qualifier: path
type: file
description: The BAI file for the input BAM file
pattern: "*.bai"
- read_group_settings:
type: file
description: |
TXT file containing the split and merge settings for
each readgroup. Each line consist of one readgroup,
single/double identifier and the maximum cycle number
of the sequencer. e.g. "RG1 single 100"
pattern: "*.txt"
qualifier: path
- blacklist:
qualifier: path
type: file
description: |
blacklist.txt (optional), A txt file with blacklisted read names
that should be ignored and just written to file, each on a new line
pattern: "*.txt"
output:
- meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- data:
- meta:
qualifier: val
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- "*_mergedReads.bam":
type: file
description: A BAM file with suffix_mergedReads.bam
pattern: "*_mergedReads.bam"
qualifier: path
- "*.txt.gz":
type: file
description: A file listing all reads that were filtered out in the merging process with suffix_ignoredReads.txt.gz
pattern: "*.txt.gz"
qualifier: path
- versions:
type: file
description: File containing software versions
pattern: "versions.yml"
- bam:
type: file
description: A BAM file with suffix_mergedReads.bam
pattern: "*_mergedReads.bam"
- filelist:
type: file
description: A file listing all reads that were filtered out in the merging process with suffix_ignoredReads.txt.gz
pattern: "*.txt.gz"
- versions.yml:
qualifier: path
type: file
description: File containing software versions
pattern: "versions.yml"
authors:
- "@merszym"
maintainers:
5 changes: 3 additions & 2 deletions modules/nf-core/bwa/mem/environment.yml
@@ -1,10 +1,11 @@
name: bwa_mem

channels:
- conda-forge
- bioconda
- defaults

dependencies:
- bwa=0.7.18
# renovate: datasource=conda depName=bioconda/samtools
- samtools=1.20
- htslib=1.20.0
- samtools=1.20
127 changes: 85 additions & 42 deletions modules/nf-core/bwa/mem/meta.yml
@@ -17,55 +17,98 @@ tools:
documentation: https://bio-bwa.sourceforge.net/bwa.shtml
arxiv: arXiv:1303.3997
licence: ["GPL-3.0-or-later"]
identifier: "biotools:bwa"
input:
- meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- reads:
type: file
description: |
List of input FastQ files of size 1 and 2 for single-end and paired-end data,
respectively.
- meta2:
type: map
description: |
Groovy Map containing reference information.
e.g. [ id:'test', single_end:false ]
- index:
type: file
description: BWA genome index files
pattern: "Directory containing BWA index *.{amb,ann,bwt,pac,sa}"
- fasta:
type: file
description: Reference genome in FASTA format
pattern: "*.{fasta,fa}"
- sort_bam:
type: boolean
description: use samtools sort (true) or samtools view (false)
pattern: "true or false"
- - meta:
qualifier: val
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- reads:
qualifier: path
type: file
description: |
List of input FastQ files of size 1 and 2 for single-end and paired-end data,
respectively.
- - meta2:
qualifier: val
type: map
description: |
Groovy Map containing reference information.
e.g. [ id:'test', single_end:false ]
- index:
qualifier: path
type: file
description: BWA genome index files
pattern: "Directory containing BWA index *.{amb,ann,bwt,pac,sa}"
- - meta3:
qualifier: val
type: map
description: |
Groovy Map containing reference information.
e.g. [ id:'test', single_end:false ]
- fasta:
qualifier: path
type: file
description: Reference genome in FASTA format
pattern: "*.{fasta,fa}"
- - sort_bam:
qualifier: val
type: boolean
description: use samtools sort (true) or samtools view (false)
pattern: "true or false"
output:
- bam:
type: file
description: Output BAM file containing read alignments
pattern: "*.{bam}"
- meta:
qualifier: val
type: file
description: Output BAM file containing read alignments
pattern: "*.{bam}"
- "*.bam":
qualifier: path
type: file
description: Output BAM file containing read alignments
pattern: "*.{bam}"
- cram:
type: file
description: Output CRAM file containing read alignments
pattern: "*.{cram}"
- meta:
qualifier: val
type: file
description: Output CRAM file containing read alignments
pattern: "*.{cram}"
- "*.cram":
qualifier: path
type: file
description: Output CRAM file containing read alignments
pattern: "*.{cram}"
- csi:
type: file
description: Optional index file for BAM file
pattern: "*.{csi}"
- meta:
qualifier: val
type: file
description: Optional index file for BAM file
pattern: "*.{csi}"
- "*.csi":
qualifier: path
type: file
description: Optional index file for BAM file
pattern: "*.{csi}"
- crai:
type: file
description: Optional index file for CRAM file
pattern: "*.{crai}"
- meta:
qualifier: val
type: file
description: Optional index file for CRAM file
pattern: "*.{crai}"
- "*.crai":
qualifier: path
type: file
description: Optional index file for CRAM file
pattern: "*.{crai}"
- versions:
type: file
description: File containing software versions
pattern: "versions.yml"
- versions.yml:
qualifier: path
type: file
description: File containing software versions
pattern: "versions.yml"
authors:
- "@drpatelh"
- "@jeremy1805"