File Processor Development Guide
Learn how to create custom file processors that handle specific file types and extensions in EvoNEST.
Overview
File processors handle file-level operations such as image processing, document handling, and binary file management. Unlike data format parsers, they focus on file type detection and basic processing rather than extracting scientific traits.
Current Architecture
Raw File → File Type Detection → File Processor → Processed File Data → Form Update
- File uploaded through form interface
- Extension detection determines file type
- File processor handles type-specific processing (thumbnails, metadata, etc.)
- Form state updated with processed file information
- User sees preview of processed file
When to Create File Processors
Create file processors when you need to:
- Handle specific file extensions not covered by existing processors
- Extract metadata from binary file formats
- Create previews or thumbnails
- Perform file-specific validation or processing
- Support new image/document formats
Note: For scientific instrument data that needs trait extraction, create a data format parser instead.
Quick Start
- Create processor function in src/utils/file-management/processors/
- Define processor metadata with supported formats and experiment types
- Register in src/utils/file-management/processors/index.ts
- Test with your file types
Creating a File Processor
1. Basic Processor Structure
import { FileProcessorParams } from './types';
import { updateFormValues } from './utils';

// Processor metadata for experiment type discovery
export const myFileTypeMetadata = {
  name: 'MyFileTypeProcessor',
  label: 'My File Type Processor',
  description: 'Processes my custom file format with specialized handling',
  version: '1.0.0',
  author: 'Your Name',
  supportedFormats: ['.myext', '.custom'],
  supportedExperimentTypes: ['my_file_experiment'],
  primaryExperimentType: 'my_file_experiment',
  requiredFields: [],
  generatedTraits: [] // File processors typically don't generate traits
};
export async function processMyFileType(params: FileProcessorParams): Promise<void> {
  const { file, defaultValues, form, setFormState, setAllFileData } = params;
  try {
    // Read file content based on type
    const fileContent = await readFileContent(file);
    // Extract metadata
    const metadata = await extractMetadata(file, fileContent);
    // Process file (resize, convert, validate, etc.)
    const processedData = await processFile(file, fileContent);
    // Create updated values
    const updatedValues = {
      ...defaultValues,
      filename: file.name,
      type: 'my_file_experiment',
      date: new Date(file.lastModified),
      name: `my_file_${file.name}`,
      metadata: [
        { key: 'originalName', value: file.name },
        { key: 'fileSize', value: file.size.toString() },
        { key: 'lastModified', value: file.lastModified.toString() },
        ...metadata
      ],
      dataFields: processedData
    };
    // Update form state
    updateFormValues(form, updatedValues, setFormState, setAllFileData);
  } catch (error) {
    console.error(`Error processing ${file.name}:`, error);
    // `error` is typed `unknown` under strict TypeScript, so narrow it before reading `.message`
    const message = error instanceof Error ? error.message : String(error);
    throw new Error(`Failed to process ${file.name}: ${message}`);
  }
}
// Helper functions
async function readFileContent(file: File): Promise<ArrayBuffer | string> {
  // Read as binary for most file types.
  // For text-based formats, use `return await file.text();` instead.
  return await file.arrayBuffer();
}

async function extractMetadata(file: File, content: ArrayBuffer | string): Promise<Array<{ key: string; value: string }>> {
  // Typed explicitly so the array is not inferred as any[]
  const metadata: Array<{ key: string; value: string }> = [];
  // Extract file-specific metadata
  // Example: EXIF data from images, document properties, etc.
  return metadata;
}

async function processFile(file: File, content: ArrayBuffer | string): Promise<any> {
  // Perform file-specific processing
  // Example: create thumbnails, convert formats, validate structure
  return content; // or processed version
}
2. Image Processor Example
export async function processImageFile(params: FileProcessorParams): Promise<void> {
  const { file, defaultValues, form, setFormState, setAllFileData } = params;
  return new Promise<void>((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = function () {
      const img = new Image();
      img.onload = function () {
        // Create thumbnail
        const canvas = document.createElement('canvas');
        const ctx = canvas.getContext('2d');
        // getContext can return null, so fail explicitly instead of asserting with `!`
        if (!ctx) {
          reject(new Error('Canvas 2D context is not available'));
          return;
        }
        canvas.width = 200;
        const aspectRatio = img.width / img.height;
        canvas.height = Math.round(canvas.width / aspectRatio);
        ctx.drawImage(img, 0, 0, canvas.width, canvas.height);
        canvas.toBlob((blob) => {
          if (blob) {
            const updatedValues = { ...defaultValues };
            updatedValues.filename = file.name;
            updatedValues.type = 'image';
            updatedValues.date = new Date(file.lastModified);
            updatedValues.name = `image_${file.name}`;
            // Store image metadata
            updatedValues.metadata = [
              { key: 'name', value: file.name },
              { key: 'type', value: file.type },
              { key: 'size', value: file.size.toString() },
              { key: 'lastModified', value: file.lastModified.toString() },
            ];
            if (img.width && img.height) {
              updatedValues.metadata.push(
                { key: 'originalWidth', value: img.width.toString() },
                { key: 'originalHeight', value: img.height.toString() }
              );
            }
            // Store thumbnail as dataFields
            updatedValues.dataFields = blob;
            updateFormValues(form, updatedValues, setFormState, setAllFileData);
            resolve();
          } else {
            reject(new Error('Failed to create thumbnail'));
          }
        }, 'image/jpeg', 0.85);
      };
      // Reject with an Error rather than the raw DOM event
      img.onerror = () => reject(new Error(`Failed to load image ${file.name}`));
      img.src = reader.result as string;
    };
    reader.onerror = () => reject(reader.error ?? new Error(`Failed to read ${file.name}`));
    reader.readAsDataURL(file);
  });
}
3. Document Processor Example
export async function processDocumentFile(params: FileProcessorParams): Promise<void> {
  const { file, defaultValues, existingNames, form, setFormState, setAllFileData } = params;
  const updatedValues = { ...defaultValues };
  updatedValues.filename = file.name;
  updatedValues.type = 'document';
  updatedValues.date = new Date(file.lastModified);
  updatedValues.name = generateUniqueName(`document_${file.name}`, existingNames);
  // For documents, we typically just store file reference
  // Advanced processors might extract text content, metadata, etc.
  updateFormValues(form, updatedValues, setFormState, setAllFileData);
}
4. Register Your Processor
Add to src/utils/file-management/processors/index.ts:
// Import your processor
import { processMyFileType, myFileTypeMetadata } from './my-file-processor';
// Export for external use
export { processMyFileType, myFileTypeMetadata } from './my-file-processor';
// Register in the processor registry
fileProcessorRegistry.register(myFileTypeMetadata, processMyFileType);
Processor Interface
FileProcessorParams
export interface FileProcessorParams {
  file: File;                          // The uploaded file
  defaultValues: ExperimentFormValues; // Default form values
  samples: any[];                      // Available samples
  existingNames: string[];             // Existing experiment names
  form: any;                           // React Hook Form instance
  setFormState: React.Dispatch<React.SetStateAction<ExperimentFormValues>>;
  setAllFileData: React.Dispatch<React.SetStateAction<Array<Partial<ExperimentFormValues>>>>;
}

export type FileProcessor = (params: FileProcessorParams) => Promise<void>;
Processor Metadata
export interface ProcessorMetadata {
  name: string;                       // Unique processor identifier
  label: string;                      // Human-readable label for UI
  description: string;                // Processor description
  version: string;                    // Processor version
  author?: string;                    // Author name
  supportedFormats: string[];         // File extensions (e.g., ['.jpg', '.png'])
  supportedExperimentTypes: string[]; // Experiment types this processor handles
  primaryExperimentType: string;      // Default experiment type
  requiredFields: string[];           // Required form fields
  generatedTraits: Array<{            // Traits generated (usually empty for processors)
    name: string;
    unit: string;
    description: string;
  }>;
}
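The metadata fields carry a few implicit constraints that are easy to get wrong when registering a new processor. The following is a minimal sketch of a pre-registration check, assuming that extensions are written lowercase with a leading dot (as in the examples above) and that the primary experiment type should be one of the supported types. `validateProcessorMetadata` and `ProcessorMetadataLike` are hypothetical names and are not part of the EvoNEST codebase.

```typescript
// Hypothetical subset of ProcessorMetadata, repeated so the sketch is self-contained.
interface ProcessorMetadataLike {
  name: string;
  supportedFormats: string[];
  supportedExperimentTypes: string[];
  primaryExperimentType: string;
}

// Returns a list of human-readable problems; an empty list means none were found.
function validateProcessorMetadata(meta: ProcessorMetadataLike): string[] {
  const problems: string[] = [];
  if (meta.name.trim() === '') {
    problems.push('name must not be empty');
  }
  if (meta.supportedFormats.length === 0) {
    problems.push('supportedFormats must list at least one extension');
  }
  for (const ext of meta.supportedFormats) {
    // Assumption: extensions look like '.jpg' (leading dot, lowercase).
    if (!/^\.[a-z0-9]+$/.test(ext)) {
      problems.push(`"${ext}" is not a lowercase extension with a leading dot`);
    }
  }
  // Assumption: the default type should also appear in the supported list.
  if (!meta.supportedExperimentTypes.includes(meta.primaryExperimentType)) {
    problems.push('primaryExperimentType is not listed in supportedExperimentTypes');
  }
  return problems;
}
```

A check like this could run in a unit test over every exported metadata object, so a typo such as 'myext' without the dot is caught before upload time.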
File Reading Strategies
Binary Files (Images, Documents)
async function processBinaryFile(file: File): Promise<ArrayBuffer> {
  const arrayBuffer = await file.arrayBuffer();
  // Process binary data
  const uint8Array = new Uint8Array(arrayBuffer);
  // Extract headers, metadata, etc.
  const header = uint8Array.slice(0, 10);
  return arrayBuffer;
}
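As a concrete instance of extracting metadata from a binary header, the sketch below reads the pixel dimensions of a PNG directly from its IHDR chunk, without decoding the image. It relies on the PNG layout (8-byte signature, then the IHDR chunk with big-endian width and height at byte offsets 16 and 20). `readPngDimensions` is a hypothetical helper, not an existing EvoNEST function.

```typescript
// Reads width and height from a PNG file's IHDR chunk.
// Returns null when the buffer is too short or is not a PNG.
function readPngDimensions(buffer: ArrayBuffer): { width: number; height: number } | null {
  const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
  // Signature (8) + chunk length (4) + chunk type (4) + width (4) + height (4) = 24 bytes.
  if (buffer.byteLength < 24) return null;
  const bytes = new Uint8Array(buffer, 0, 24);
  for (let i = 0; i < PNG_SIGNATURE.length; i++) {
    if (bytes[i] !== PNG_SIGNATURE[i]) return null;
  }
  const view = new DataView(buffer);
  // The first chunk must be IHDR: bytes 12..15 are the ASCII codes for "IHDR".
  if (view.getUint32(12) !== 0x49484452) return null;
  // PNG stores integers big-endian, which is DataView's default byte order.
  return { width: view.getUint32(16), height: view.getUint32(20) };
}
```

Inside a processor, the result could be pushed into the metadata array as `originalWidth` and `originalHeight` entries, using `await file.slice(0, 24).arrayBuffer()` so only the header is read.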
Text Files
async function processTextFile(file: File): Promise<string> {
  const text = await file.text();
  // Check if it's structured data that should use a parser
  const parsedData = dataFormatParserRegistry.parse(text, { filename: file.name });
  if (parsedData) {
    // Structured data found - delegate to data format parser
    throw new Error('This file contains structured data and should be processed by a data format parser');
  }
  // Process as plain text
  return text;
}
Stream Processing (Large Files)
async function processLargeFile(file: File): Promise<void> {
  const stream = file.stream();
  const reader = stream.getReader();
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // Process chunk
      processChunk(value);
    }
  } finally {
    reader.releaseLock();
  }
}
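The `processChunk` call above is left undefined. One possible shape, assuming the goal is to compute something over the whole file without holding it in memory, is an accumulator that keeps running state between chunks. The sketch below tracks the byte count and a 32-bit FNV-1a hash; `ChunkDigest` is a hypothetical class, and FNV-1a is chosen only because it is short, not because EvoNEST uses it.

```typescript
// Hypothetical accumulator that could back the processChunk call above.
class ChunkDigest {
  private hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  private total = 0;

  // Feed one chunk; state carries over between calls.
  update(chunk: Uint8Array): void {
    for (let i = 0; i < chunk.length; i++) {
      this.hash ^= chunk[i];
      // Multiply by the FNV prime and keep the result as an unsigned 32-bit value.
      this.hash = Math.imul(this.hash, 0x01000193) >>> 0;
    }
    this.total += chunk.length;
  }

  // Number of bytes seen so far.
  get byteCount(): number {
    return this.total;
  }

  // Current hash as an 8-character lowercase hex string.
  get hex(): string {
    return this.hash.toString(16).padStart(8, '0');
  }
}
```

In the read loop, `processChunk(value)` would become `digest.update(value)` on a `ChunkDigest` created before the loop. Because the state is carried across calls, the result does not depend on how the stream happens to split the file into chunks.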
Available Utility Functions
From utils.ts
import {
  generateUniqueName,         // Create unique experiment names
  getSuggestedExperimentType, // Get suggested type based on data
  updateFormValues,           // Update form state consistently
  resetGeneratedNames         // Reset name generator for new batch
} from './utils';
// Generate unique experiment name
const uniqueName = generateUniqueName(`${fileType}_${file.name}`, existingNames);
// Update form state
updateFormValues(form, updatedValues, setFormState, setAllFileData);
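The implementation of `generateUniqueName` is not shown in this guide. To illustrate the idea only, here is a minimal sketch that assumes uniqueness is achieved by appending a numeric suffix when the base name is already taken. `makeUniqueName` is a hypothetical stand-in, and the real helper may use a different scheme (it also appears to keep state across a batch, given `resetGeneratedNames`).

```typescript
// Hypothetical stand-in for generateUniqueName; the suffix scheme is an assumption.
function makeUniqueName(baseName: string, existingNames: string[]): string {
  const taken = new Set(existingNames);
  if (!taken.has(baseName)) return baseName;
  // Try baseName_2, baseName_3, ... until a free name is found.
  let counter = 2;
  while (taken.has(`${baseName}_${counter}`)) {
    counter++;
  }
  return `${baseName}_${counter}`;
}
```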
Integration with File Upload
Automatic Processor Selection
File processors are automatically selected based on file extension:
// In extension-processors.tsx
const fileExtension = path.extname(file.name).toLowerCase();
switch (fileExtension) {
  case '.jpg':
  case '.jpeg':
  case '.png':
  case '.gif':
  case '.bmp':
  case '.webp':
  case '.svg':
    await processImageFile(params);
    break;
  case '.tiff':
  case '.tif':
    await processTiffFile(params);
    break;
  case '.myext':
    await processMyFileType(params);
    break;
  default:
    // Try data format parsers for text files
    await processPlainTextFile(params);
}
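The snippet above uses `path.extname`, which comes from Node's `path` module; in browser code it is only available if the bundler provides a polyfill. The sketch below shows one way to do the same selection without that dependency, using a lookup table instead of a switch. `getExtension`, `EXTENSION_TO_KIND`, and `selectProcessorKind` are hypothetical names, and the string labels stand in for the processor functions above.

```typescript
// Returns the lowercased extension including the dot, or '' when there is none.
function getExtension(filename: string): string {
  const base = filename.slice(filename.lastIndexOf('/') + 1);
  const dot = base.lastIndexOf('.');
  // No dot, or only a leading dot (e.g. ".gitignore"), means no extension.
  if (dot <= 0) return '';
  return base.slice(dot).toLowerCase();
}

// Mirrors the cases of the switch statement above.
const EXTENSION_TO_KIND: Record<string, string> = {
  '.jpg': 'image',
  '.jpeg': 'image',
  '.png': 'image',
  '.gif': 'image',
  '.bmp': 'image',
  '.webp': 'image',
  '.svg': 'image',
  '.tiff': 'tiff',
  '.tif': 'tiff',
  '.myext': 'myFileType',
};

// Falls back to plain-text handling, like the default branch of the switch.
function selectProcessorKind(filename: string): string {
  return EXTENSION_TO_KIND[getExtension(filename)] ?? 'plainText';
}
```

With a table, supporting a new extension is a one-line change, and the same table can be reused for validation or for listing accepted formats in the upload UI.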
Registry Integration
The processor registry enables automatic discovery:
// Processors are automatically registered
fileProcessorRegistry.register(myFileTypeMetadata, processMyFileType);
// Discovery is automatic
const supportedTypes = fileProcessorRegistry.getSupportedExperimentTypes();
const processorInfo = fileProcessorRegistry.getMetadata('MyFileTypeProcessor');
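The registry's implementation is not shown in this guide. To make the three calls above concrete, here is a minimal sketch of a registry that supports them. `SimpleProcessorRegistry` and `RegistryMetadata` are hypothetical, the duplicate-name check is an assumption, and the real `fileProcessorRegistry` may behave differently.

```typescript
// Hypothetical subset of ProcessorMetadata, kept small for the sketch.
interface RegistryMetadata {
  name: string;
  supportedFormats: string[];
  supportedExperimentTypes: string[];
}

type Processor<P> = (params: P) => Promise<void>;

class SimpleProcessorRegistry<P> {
  private entries = new Map<string, { metadata: RegistryMetadata; processor: Processor<P> }>();

  // Stores a processor under its metadata name; duplicates are rejected (assumption).
  register(metadata: RegistryMetadata, processor: Processor<P>): void {
    if (this.entries.has(metadata.name)) {
      throw new Error(`Processor "${metadata.name}" is already registered`);
    }
    this.entries.set(metadata.name, { metadata, processor });
  }

  // Looks up metadata by processor name; undefined when nothing is registered.
  getMetadata(name: string): RegistryMetadata | undefined {
    return this.entries.get(name)?.metadata;
  }

  // Collects the distinct experiment types across all registered processors.
  getSupportedExperimentTypes(): string[] {
    const types = new Set<string>();
    this.entries.forEach(({ metadata }) => {
      metadata.supportedExperimentTypes.forEach((t) => types.add(t));
    });
    return Array.from(types);
  }
}
```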
Error Handling
File Validation
async function validateFile(file: File): Promise<void> {
  // Check file size
  if (file.size > 10 * 1024 * 1024) { // 10MB limit
    throw new Error('File too large (maximum 10MB)');
  }
  // Check file type
  const allowedTypes = ['.jpg', '.png', '.pdf'];
  const extension = path.extname(file.name).toLowerCase();
  if (!allowedTypes.includes(extension)) {
    throw new Error(`Unsupported file type: ${extension}`);
  }
  // Check file content
  const header = await file.slice(0, 10).arrayBuffer();
  if (!isValidFileHeader(header)) {
    throw new Error('Invalid file format');
  }
}
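`validateFile` calls `isValidFileHeader`, which this guide does not define. A minimal sketch follows, assuming the check is a comparison against the well-known leading bytes ("magic numbers") of the three allowed types: JPEG, PNG, and PDF. The function body and the `SIGNATURES` table are illustrative, not the project's actual implementation.

```typescript
// Leading bytes that identify each allowed format.
const SIGNATURES: Array<{ label: string; bytes: number[] }> = [
  { label: 'jpeg', bytes: [0xff, 0xd8, 0xff] },
  { label: 'png', bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { label: 'pdf', bytes: [0x25, 0x50, 0x44, 0x46] }, // ASCII "%PDF"
];

// True when the header starts with one of the known signatures.
function isValidFileHeader(header: ArrayBuffer): boolean {
  const bytes = new Uint8Array(header);
  return SIGNATURES.some(
    (sig) => bytes.length >= sig.bytes.length && sig.bytes.every((b, i) => bytes[i] === b)
  );
}
```

Note that this sketch only confirms the header matches some allowed format. It does not verify that the signature agrees with the file's extension; a stricter version could look up the expected signature by extension.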
Processing Errors
export async function processMyFileType(params: FileProcessorParams): Promise<void> {
  // form, setFormState, and setAllFileData are needed by updateFormValues below
  const { file, form, setFormState, setAllFileData } = params;
  try {
    // Validate file first
    await validateFile(file);
    // Process file
    const result = await processFile(file);
    // Update form
    updateFormValues(form, result, setFormState, setAllFileData);
  } catch (error) {
    console.error(`Error processing ${file.name}:`, error);
    // Narrow the unknown error, then match case-insensitively:
    // validateFile throws "Unsupported file type" with a capital U
    const message = error instanceof Error ? error.message : String(error);
    const lowered = message.toLowerCase();
    // Provide user-friendly error messages
    if (lowered.includes('too large')) {
      throw new Error(`File "${file.name}" is too large. Please use a smaller file.`);
    } else if (lowered.includes('unsupported')) {
      throw new Error(`File type not supported for "${file.name}". Please use a different format.`);
    } else {
      throw new Error(`Failed to process "${file.name}": ${message}`);
    }
  }
}
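Matching on substrings of error messages is sensitive to wording and letter case. One alternative, sketched below under the assumption that validation code can be changed to throw a dedicated error type, is to attach a machine-readable code to each failure and map codes to user-facing text in one place. `FileProcessingError`, `FileErrorCode`, and `toUserMessage` are hypothetical names.

```typescript
type FileErrorCode = 'TOO_LARGE' | 'UNSUPPORTED_TYPE' | 'INVALID_FORMAT';

// Error subclass that carries a stable code alongside the message.
class FileProcessingError extends Error {
  readonly code: FileErrorCode;

  constructor(code: FileErrorCode, message: string) {
    super(message);
    this.name = 'FileProcessingError';
    this.code = code;
    // Restore the prototype chain so instanceof still works when compiled to ES5.
    Object.setPrototypeOf(this, FileProcessingError.prototype);
  }
}

// Converts any thrown value into a message suitable for showing to the user.
function toUserMessage(fileName: string, error: unknown): string {
  if (error instanceof FileProcessingError) {
    switch (error.code) {
      case 'TOO_LARGE':
        return `File "${fileName}" is too large. Please use a smaller file.`;
      case 'UNSUPPORTED_TYPE':
        return `File type not supported for "${fileName}". Please use a different format.`;
      case 'INVALID_FORMAT':
        return `File "${fileName}" does not look like a valid file of its type.`;
    }
  }
  const detail = error instanceof Error ? error.message : String(error);
  return `Failed to process "${fileName}": ${detail}`;
}
```

With this shape, a validator would throw `new FileProcessingError('TOO_LARGE', 'File too large (maximum 10MB)')`, and the catch block reduces to `throw new Error(toUserMessage(file.name, error))`.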
Best Practices
1. Memory Management
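// For large images, downscale on a canvas rather than keeping full-size data in memory
async function processLargeImage(file: File): Promise<Blob> {
  // Use canvas for image processing to avoid memory issues
  const canvas = document.createElement('canvas');
  const ctx = canvas.getContext('2d');
  // Set reasonable canvas size limits
  const maxWidth = 2048;
  const maxHeight = 2048;
  // Process image...
  return new Promise<Blob>((resolve, reject) => {
    // toBlob passes null when encoding fails, so reject instead of resolving with null
    canvas.toBlob(
      (blob) => (blob ? resolve(blob) : reject(new Error('Failed to encode image'))),
      'image/jpeg',
      0.8
    );
  });
}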
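The "Process image..." step above needs target dimensions that respect `maxWidth` and `maxHeight` while keeping the aspect ratio. A minimal sketch of that calculation follows; `fitWithin` is a hypothetical helper, and the choice never to upscale small images is an assumption.

```typescript
// Scales (width, height) down to fit inside (maxWidth, maxHeight), preserving aspect ratio.
function fitWithin(
  width: number,
  height: number,
  maxWidth = 2048,
  maxHeight = 2048
): { width: number; height: number } {
  if (width <= 0 || height <= 0) {
    throw new Error('Image dimensions must be positive');
  }
  // Cap the scale factor at 1 so small images are never enlarged.
  const scale = Math.min(1, maxWidth / width, maxHeight / height);
  return {
    // Keep at least one pixel per side for extreme aspect ratios.
    width: Math.max(1, Math.round(width * scale)),
    height: Math.max(1, Math.round(height * scale)),
  };
}
```

The result would be assigned to `canvas.width` and `canvas.height` before calling `drawImage`, so the canvas never allocates more than the configured limit.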
2. File Type Detection
function detectFileType(file: File): string {
  // Use multiple detection methods
  const extension = path.extname(file.name).toLowerCase();
  const mimeType = file.type;
  // Validate consistency
  if (extension === '.jpg' && !mimeType.includes('image/jpeg')) {
    console.warn('File extension and MIME type mismatch');
  }
  return extension;
}
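The consistency check above covers only `.jpg`. A sketch of how it could be generalized with a lookup table follows. `EXPECTED_MIME` and `checkMimeConsistency` are hypothetical names, and the table lists only common image and PDF types. Because browsers report an empty `type` when they cannot determine one, an empty MIME type is treated as "unknown" rather than as a mismatch.

```typescript
// Expected MIME type for each extension this sketch knows about.
const EXPECTED_MIME: Record<string, string> = {
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.png': 'image/png',
  '.gif': 'image/gif',
  '.bmp': 'image/bmp',
  '.webp': 'image/webp',
  '.svg': 'image/svg+xml',
  '.tif': 'image/tiff',
  '.tiff': 'image/tiff',
  '.pdf': 'application/pdf',
};

type MimeCheck = 'match' | 'mismatch' | 'unknown';

// Compares a file's extension with its reported MIME type.
function checkMimeConsistency(extension: string, mimeType: string): MimeCheck {
  const expected = EXPECTED_MIME[extension.toLowerCase()];
  // Unlisted extension or missing MIME type: nothing to compare against.
  if (!expected || mimeType === '') return 'unknown';
  return mimeType.toLowerCase() === expected ? 'match' : 'mismatch';
}
```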
3. Progressive Processing
async function processFileWithProgress(file: File, onProgress?: (progress: number) => void): Promise<any> {
  const totalSteps = 5;
  let currentStep = 0;
  // Step 1: Validate
  onProgress?.(++currentStep / totalSteps);
  await validateFile(file);
  // Step 2: Read
  onProgress?.(++currentStep / totalSteps);
  const content = await file.arrayBuffer();
  // Step 3: Process
  onProgress?.(++currentStep / totalSteps);
  const processed = await processContent(content);
  // Continue...
  return processed;
}
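The step counting above can be factored into a small helper so that each stage reports progress with a single call and the fraction never exceeds 1. This is a sketch only; `createStepReporter` is a hypothetical name, and calling it after each step completes (rather than before it starts) is a design choice, not something the guide prescribes.

```typescript
// Returns a function that marks one step as done and reports the completed fraction.
function createStepReporter(
  totalSteps: number,
  onProgress?: (progress: number) => void
): () => number {
  if (totalSteps <= 0) {
    throw new Error('totalSteps must be positive');
  }
  let completed = 0;
  return function stepDone(): number {
    // Clamp so extra calls cannot push progress past 100%.
    completed = Math.min(completed + 1, totalSteps);
    const fraction = completed / totalSteps;
    onProgress?.(fraction);
    return fraction;
  };
}
```

Usage would look like `const stepDone = createStepReporter(3, onProgress);` followed by `await validateFile(file); stepDone();` for each stage.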
Available File Processors
- Image Processor: JPEG, PNG, GIF, BMP, WebP, SVG - creates thumbnails and extracts dimensions
- TIFF Processor: TIFF files with specialized scientific imaging support
- Document Processor: PDF, DOC, TXT - handles document files for protocols
- Lossless Image Processor: RAW and lossless formats preserving quality
Testing Your Processor
// Test file processing
const mockFile = new File(['test content'], 'test.myext', { type: 'application/octet-stream' });
const mockParams = {
  file: mockFile,
  defaultValues: {},
  samples: [],
  existingNames: [],
  form: mockForm, // A stubbed React Hook Form instance supplied by your test setup
  setFormState: jest.fn(),
  setAllFileData: jest.fn()
};
await processMyFileType(mockParams);
// Verify results
expect(mockParams.setFormState).toHaveBeenCalledWith(
  expect.objectContaining({
    filename: 'test.myext',
    type: 'my_file_experiment'
  })
);
For data format parsers that extract scientific traits, see the Data Format Parser Development Guide.