Data Import Guide
EvoNEST provides CSV import capabilities to help you quickly migrate existing data or perform bulk uploads. This guide covers both standard and hierarchical import methods.
Prerequisites
Before importing data, ensure you have:
- Admin or import permissions in your EvoNEST instance
- CSV files with your sample data
- Basic understanding of your data structure and relationships. For a successfull merging to happen, fields have to have the same values they have in EvoNEST (for example, sex is distinguished in EvoNEST by using "male" and "female", thus "ff", "mm" or others expression would cause problems during field recognition)
- Required field mappings identified (family, genus, species, type, responsible person)
Import Methods
EvoNEST supports two import methods:
- Standard Import: For simple, flat data structures
- Hierarchical Import: For parent-child relationships (e.g., animals with subsamples)
Standard Import
Use standard import when your data doesn't have parent-child relationships, or when each row represents an independent sample.
Step 1: Prepare your CSV file
Your CSV should contain columns for sample information. Required fields include:
name,family,genus,species,type,responsible,location,date
Sample001,Araneidae,Araneus,diadematus,animal,john.doe@example.com,Berlin,2024-01-15
Sample002,Araneidae,Nephila,clavipes,silk,jane.smith@example.com,Hamburg,2024-01-20
Required fields
- family: Taxonomic family
- genus: Taxonomic genus
- species: Taxonomic species
- type: Sample type (animal, silk, subsample, other)
- responsible: Person responsible (name, email, or user ID)
Optional fields
- name: Sample identifier (auto-generated if not provided)
- location: Collection location
- date: Collection date
- notes: Additional notes
- lat/lon: Geographic coordinates
Step 2: Access the import tool
- Navigate to Samples in the main menu
- Click Import button
- Select Choose CSV File
Step 3: Upload and map fields
- Upload your CSV: Click "Select CSV File" and choose your file
- Review automatic mapping: EvoNEST will automatically suggest field mappings based on column names
- Adjust mappings: Use the dropdown menus in each column header to map CSV columns to EvoNEST fields
- Use special mappings for complex fields:
- Nomenclature: Automatically splits "Genus species" format into separate genus and species fields
- Responsible Person: Accepts names, emails, or user IDs and automatically resolves to user accounts
Step 4: Validate and import
- Review validation errors: Any issues will be highlighted in red
- Fix mapping issues: Ensure all required fields are mapped
- Click Import: Start the import process
- Monitor progress: Watch the progress bar and status updates
Hierarchical Import
Use hierarchical import when you have animals with multiple subsamples, where several rows share the same specimen but represent different samples (e.g., draglines, silk samples, etc.).
When to use hierarchical import
Perfect for data like:
- Subsamples: One or more organisms with multiple subsamples of different types
- Temporal sampling: One individual sampled at different time points
Step 1: Prepare your hierarchical CSV
Your CSV needs special identifier columns:
animal_id,sample_id,family,genus,species,responsible,sample_type,notes
MACN-Ar-001,SAMPLE-001-1,Araneidae,Micrathena,plana,researcher@lab.com,dragline,First dragline
MACN-Ar-001,SAMPLE-001-2,Araneidae,Micrathena,plana,researcher@lab.com,dragline,Second dragline
MACN-Ar-001,SAMPLE-001-3,Araneidae,Micrathena,plana,researcher@lab.com,dragline,Third dragline
MACN-Ar-002,SAMPLE-002-1,Araneidae,Nephila,clavipes,researcher@lab.com,silk,Web sample
Key requirements:
- animal_id: Unique identifier for each animal/specimen (multiple rows can share this)
- sample_id: Unique identifier for each individual sample
- Consistent animal data: All rows with the same animal_id should have identical animal-level information (family, genus, species, etc.)
Step 2: Upload and enable hierarchical mode
- Upload your CSV as normal
- Map the special fields:
- Map your animal identifier column to "Animal ID (Specimen Identifier)"
- Map your sample identifier column to "Subsample ID (Sample Identifier)"
- Hierarchical mode activates automatically when both special mappings are detected
Step 3: Review hierarchical detection
When hierarchical mode is active, you'll see:
✓ Hierarchical Import Mode Detected
Will import X animals first, then Y subsamples.
The system will:
- Identify unique animals from your data (one per unique animal_id)
- Count total subsamples (all rows in your CSV)
- Show the import plan clearly
Step 4: Complete the hierarchical import
The import happens in two phases:
Phase 1: Animal Import
- Creates or finds animals: Each unique animal_id becomes one animal record
- Handles duplicates: If an animal with the same name already exists, the existing record is used
- Uses first row data: For animals with multiple rows, data from the first occurrence is used
Phase 2: Subsample Import
- Creates all subsamples: Each row becomes a subsample record
- Links to parents: Each subsample is automatically linked to its parent animal
- Preserves relationships: Parent-child relationships are maintained
Special Field Mappings
EvoNEST provides intelligent field mappings for common data formats:
Nomenclature Mapping
Automatically splits scientific names:
Input: "Homo sapiens"
Output: genus="Homo", species="sapiens"
Input: "Quercus robur subsp. pedunculiflora"
Output: genus="Quercus", species="robur subsp. pedunculiflora"
Responsible Person Mapping
Automatically resolves to user accounts:
Input: "John Doe" → Finds user with name "John Doe"
Input: "john.doe@lab.com" → Finds user with email "john.doe@lab.com"
Input: "user_id_12345" → Finds user with ID "user_id_12345"
Custom Fields
Create custom fields on-the-fly:
- Select "Use as custom field" for any unmapped column
- The field will be created and data preserved
- Useful for project-specific metadata
Validation and Error Handling
Common validation errors:
Missing Required Fields
❌ Missing mappings for required fields: family, genus, species
Solution: Map all required fields using the dropdown menus
Data Type Mismatches
❌ Row 5: Invalid date value "not-a-date"
Solution: Check your data format matches expected types
User Not Found
❌ Row 3: User "unknown@email.com" not found
Solution: Ensure responsible persons exist as users in EvoNEST
Hierarchical Import Issues
❌ Parent animal "SPECIMEN-001" not found for subsample "SAMPLE-001-1"
Solution: Ensure animal_id values are consistent and animals import successfully
Tips for successful imports:
- Test with small files first: Start with 5-10 rows to verify mappings
- Check required fields: Ensure family, genus, species, type, and responsible are mapped
- Verify user accounts: All responsible persons should exist in EvoNEST
- Use consistent naming: Keep animal_id values identical across related rows
- Review progress: Watch for error messages during import
Advanced Features
Batch Import Strategies
For large datasets:
- Split large files: Break files into smaller chunks (< 1000 rows)
- Import animals first: In hierarchical data, you can import animals separately, then subsamples
- Use consistent IDs: Maintain consistent naming schemes across batches
Data Migration from Other Systems
When migrating from other systems:
- Export to CSV: Most systems can export to CSV format
- Map field names: Create a mapping document for your field names
- Clean data: Remove or fix invalid entries before import
- Test incrementally: Import small batches and verify results
Integration with External Databases
EvoNEST imports can be enhanced with:
- GBIF integration: Verify taxonomic names against GBIF database
- Geographic validation: Validate coordinates against known ranges
- Collection protocols: Link to standardized collection methods
Example Workflows
Workflow 1: Spider Dragline Study
animal_id,sample_id,family,genus,species,responsible,collection_date,dragline_type
MACN-Ar-47148,MJR-2847-1,Araneidae,Micrathena,plana,ramirez@museo.com,2024-12-15,major ampullate
MACN-Ar-47148,MJR-2847-2,Araneidae,Micrathena,plana,ramirez@museo.com,2024-12-15,minor ampullate
MACN-Ar-47148,MJR-2847-3,Araneidae,Micrathena,plana,ramirez@museo.com,2024-12-15,flagelliform
Result: 1 animal (Micrathena plana) with 3 dragline subsamples
Workflow 2: Museum Collection Import
name,family,genus,species,type,responsible,collection_date,location
BMNH-001,Araneidae,Araneus,diadematus,animal,curator@bmnh.ac.uk,2024-01-10,London
BMNH-002,Salticidae,Salticus,scenicus,animal,curator@bmnh.ac.uk,2024-01-11,Edinburgh
Result: 2 independent animal records
Troubleshooting
Import Fails to Start
- Check file format: Ensure file is CSV with proper encoding (UTF-8)
- Verify file size: Large files (>10MB) may need to be split
- Check permissions: Ensure you have import permissions
Partial Import Success
- Review error log: Check which records failed and why
- Fix data issues: Correct invalid entries and re-import failed records
- Check relationships: In hierarchical imports, ensure parent records exist
Performance Issues
- Reduce batch size: Import smaller files more frequently
- Check server resources: Large imports may require server optimization
- Schedule imports: Run large imports during off-peak hours
Getting Help
If you encounter issues with data import:
- Check the FAQ section for common import questions
- Review Troubleshooting Guide for technical issues
- Visit our GitHub Issues for bug reports
- Contact your system administrator for permission-related issues
For more information on managing your imported data, see:
- Sample Management - Organizing and editing samples
- Data Export - Exporting data for analysis
- Visualization - Creating charts and graphs from your data