3. Data, Science & AI
Found 9989 skills
nemo-curator
davila7
GPU-accelerated data curation tool for preparing high-quality training datasets for LLMs, featuring deduplication, quality filtering, and content safety checks.
fluidsim
davila7
Python framework for computational fluid dynamics simulations, supporting Navier-Stokes solvers, turbulence analysis, and FFT-based methods on HPC systems.
fine-tuning-with-trl
davila7
Fine-tunes LLMs with TRL on top of Hugging Face Transformers, covering supervised fine-tuning (SFT) and preference-based methods (DPO, PPO) for preference alignment and reward optimization.
sparse-autoencoder-training
davila7
Guides training and analysis of Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features for model analysis.
optimizing-attention-flash
davila7
Accelerates transformer training/inference with 2-4x speedup and 10-20x memory reduction using Flash Attention for long sequences.
torch-geometric
davila7
Provides tools for building and training Graph Neural Networks (GNNs) for node classification, link prediction, and molecular property prediction using PyTorch Geometric.
pennylane
davila7
Python library for quantum machine learning, quantum circuit design, and hybrid quantum-classical model training with automatic differentiation and PyTorch integration.
diffdock
davila7
Predicts protein-ligand binding poses and confidence scores using diffusion models, supporting PDB and SMILES inputs for structure-based drug design.
segment-anything-model
davila7
Provides zero-shot image segmentation using points, boxes, or masks as prompts, or automatically generates all object masks in an image.
exploratory-data-analysis
davila7
Automates exploratory data analysis for scientific datasets, detecting file types and generating structured reports with quality metrics and recommendations.
drugbank-database
davila7
Accesses and analyzes comprehensive drug data from DrugBank, including properties, interactions, targets, and pharmacology, for research and drug discovery.
geniml
davila7
Enables machine learning analysis of genomic regions using BED files, including region embeddings and scATAC-seq processing.
zinc-database
davila7
Accesses the ZINC database for drug discovery, enabling compound searches by ID, SMILES, or similarity, plus 3D structure analysis for virtual screening.
cobrapy
davila7
Enables constraint-based metabolic modeling with FBA, FVA, gene knockouts, and SBML support for systems biology and metabolic engineering analysis.
senior-prompt-engineer
davila7
Provides advanced prompt engineering for LLM optimization, RAG, agent design, and structured outputs to enhance AI product performance.
biorxiv-database
davila7
Searches bioRxiv preprints by keyword, author, or date range, retrieving metadata and PDFs for scientific literature reviews.
uniprot-database
davila7
Provides direct REST API access for protein data retrieval from UniProt, including searches, FASTA sequences, and ID mapping for bioinformatics workflows.
llama-factory
davila7
Provides a no-code web interface for fine-tuning large language models, with quantization support and multimodal capabilities.
labarchive-integration
davila7
Provides API integration for electronic lab notebooks (ELN) to manage entries, attachments, and workflows with scientific tools including Jupyter and REDCap.
pubmed-database
davila7
Provides direct REST API access to PubMed for querying biomedical literature, supporting advanced Boolean/MeSH queries, batch processing, and citation management.
scientific-schematics
davila7
Creates publication-quality scientific diagrams with AI-driven refinement for neural networks, biological pathways, and complex visualizations.
alphafold-database
davila7
Accesses AlphaFold's database of AI-predicted protein structures, enabling retrieval by UniProt ID and analysis of confidence metrics for structural biology research.
seaborn
davila7
Statistical visualization library for creating scatter, box, violin, heatmap, and regression plots for exploratory data analysis and publication-ready figures.
sentencepiece
davila7
Language-independent text tokenizer using BPE and Unigram algorithms, optimized for speed and multilingual support in AI models.