Feeder#
The bfabric-cli feeder command groups feeder operations. Its main use is creating importresources for files in a B-Fabric storage.
Overview#
bfabric-cli feeder --help
Available subcommands:
| Subcommand | Purpose |
|---|---|
| create-importresource | Create importresources for files in a storage |
Creating Importresources#
Create importresources for one or more files in a B-Fabric storage.
Basic Usage#
bfabric-cli feeder create-importresource [STORAGE_ID] [FILES]...
Parameters#
| Parameter | Required | Description |
|---|---|---|
| STORAGE_ID | Yes | ID of the target storage |
| FILES | Yes | One or more file paths to create importresources for |
Examples#
Create a single importresource:
bfabric-cli feeder create-importresource 1 /path/to/data/file.raw
Create importresources for multiple files:
bfabric-cli feeder create-importresource 1 \
/path/to/data/file1.raw \
/path/to/data/file2.raw \
/path/to/data/file3.raw
Use a glob pattern to match multiple files:
bfabric-cli feeder create-importresource 1 /path/to/data/*.raw
What It Does#
The command:
Validates files - Checks that the specified files exist
Parses file paths - Analyzes the path structure using the storage’s path convention
Computes file metadata - Calculates MD5 checksum, file size, file date
Creates importresources - Creates importresource entities in B-Fabric for each file
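The metadata step can be reproduced locally to cross-check a file before registering it. The following is a minimal sketch, assuming GNU coreutils (`md5sum`, `stat -c`). The function name `file_metadata` is hypothetical and not part of bfabric-cli, and the date format that the command itself stores is not specified here, so the sketch simply prints the modification time in seconds since the epoch.

```bash
#!/bin/bash
# Hypothetical helper (not part of bfabric-cli): print the MD5 checksum, the
# size in bytes and the modification time (seconds since the epoch) of one
# file, separated by tabs. Assumes GNU coreutils (md5sum, stat -c).
file_metadata() {
    local file="$1"
    if [ ! -f "$file" ]; then
        echo "not a regular file: $file" >&2
        return 1
    fi
    local md5 size mtime
    # Reading from stdin keeps the file name out of the md5sum output
    md5=$(md5sum < "$file" | cut -d' ' -f1)
    size=$(stat -c %s -- "$file")
    mtime=$(stat -c %Y -- "$file")
    printf '%s\t%s\t%s\n' "$md5" "$size" "$mtime"
}
```

For example, `file_metadata /path/to/data/file.raw` prints one tab-separated line that can be compared with the values shown for the importresource in B-Fabric.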
Path Convention (CompMS)#
For CompMS (Mass Spectrometry) data, the command uses the PathConventionCompMS parser, which expects files to follow a specific directory structure:
/storage_root/
├── application_name/
│ ├── container_id/
│ │ ├── sample_id/
│ │ │ └── file.raw
The parser extracts:
Application name - From the directory name
Container ID - From the container directory
Sample ID - Optional, if present
Relative path - Path relative to the storage root
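To check how a file would map onto this layout, the directory components can be split with plain shell parameter expansion. This is only a sketch of the layout shown above, not the actual PathConventionCompMS implementation. The function name `split_compms_path` is hypothetical, and the sketch assumes that the sample directory is the only optional level.

```bash
#!/bin/bash
# Hypothetical helper: split a file path into the components of the layout
# shown above. Usage: split_compms_path STORAGE_ROOT FILE
split_compms_path() {
    local root="${1%/}" file="$2"
    # The file must be located below the storage root
    case "$file" in
        "$root"/*) ;;
        *) echo "file is not below storage root: $file" >&2; return 1 ;;
    esac
    local rel="${file#"$root"/}"
    local -a parts
    IFS='/' read -r -a parts <<< "$rel"
    # At least application/container/file is required
    if [ "${#parts[@]}" -lt 3 ]; then
        echo "path too short for the expected layout: $rel" >&2
        return 1
    fi
    echo "application=${parts[0]}"
    echo "container=${parts[1]}"
    # The sample directory is optional: it is present with 4 or more components
    if [ "${#parts[@]}" -ge 4 ]; then
        echo "sample=${parts[2]}"
    fi
    echo "relative_path=$rel"
}
```

Calling `split_compms_path /storage_root /storage_root/application_name/container_id/sample_id/file.raw` prints one `key=value` line per extracted component.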
Output#
The command provides feedback on:
Successful creations: “Importresource X created for file /path/to/file.raw”
Updates: “Importresource X updated for file /path/to/file.raw” (if an importresource already exists)
Errors: “Application Y not found in B-Fabric. Skipping file /path/to/file.raw”
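When many files are processed, a quick tally of these messages can help. The sketch below assumes that the command output was saved to a log file (for example with shell redirection) and that the messages use exactly the wording quoted above. The function name `summarize_feeder_log` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper (not part of bfabric-cli): count outcome messages in a
# saved log file. Assumes the message wording quoted above.
summarize_feeder_log() {
    local log="$1"
    local created updated skipped
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
    created=$(grep -c 'Importresource .* created for file' "$log" || true)
    updated=$(grep -c 'Importresource .* updated for file' "$log" || true)
    skipped=$(grep -c 'Skipping file' "$log" || true)
    echo "created=$created updated=$updated skipped=$skipped"
}
```

If the actual wording differs in your installation, adjust the three patterns accordingly.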
Workflow Examples#
Initial Data Ingestion#
# 1. First, verify the storage exists
bfabric-cli api read storage id 1
# 2. Create importresources for new files
bfabric-cli feeder create-importresource 1 /data/2025-01/*.raw
Monitoring File Addition#
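#!/bin/bash
# check_new_files.sh
# Scans the incoming directory once and registers every .raw file found.
# Run it periodically (e.g. from cron) to pick up newly added files.
STORAGE_ID=1
DATA_DIR="/data/incoming"
# -print0 / read -d '' keep file names with spaces or newlines intact
find "$DATA_DIR" -name "*.raw" -type f -print0 | while IFS= read -r -d '' file; do
    echo "Processing $file..."
    bfabric-cli feeder create-importresource "$STORAGE_ID" "$file"
done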
Batch Processing Multiple Storages#
# Process files across multiple storages
for storage_id in 1 2 3; do
    bfabric-cli feeder create-importresource "$storage_id" "/data/storage_${storage_id}"/*.raw
done
Finding Storage Information#
Before creating importresources, verify your storage configuration:
List All Storages#
bfabric-cli api read storage --limit 20
Show Specific Storage#
bfabric-cli api read storage id 1
Check Storage Path Convention#
The storage information will show:
Storage ID and name
Base URL/path
Path convention type (e.g., CompMS)
Working with Importresources#
After creating importresources, you can work with them:
List Importresources#
# List all importresources
bfabric-cli api read importresource --limit 50
# Filter by storage
bfabric-cli api read importresource storageid 1
# Filter by date
bfabric-cli api read importresource createdafter 2024-12-01 --limit 20
Check Importresource Details#
# Show specific importresource
bfabric-cli api read importresource id 12345
Tips and Best Practices#
Verify Files Before Processing#
# Check files exist before creating importresources
ls -lh /data/incoming/*.raw
# Verify file integrity
md5sum /data/incoming/*.raw
Use Absolute Paths#
# Use absolute paths to avoid ambiguity
bfabric-cli feeder create-importresource 1 /full/path/to/data/file.raw
Process in Batches#
# For large numbers of files, limit each run to a batch (here: the first 100 files found)
find /data/incoming -name "*.raw" -type f | head -n 100 | while IFS= read -r file; do
    bfabric-cli feeder create-importresource 1 "$file"
done
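Because create-importresource accepts several files per call, the whole file list can also be split into fixed-size batches instead of stopping after the first one. A minimal sketch, assuming GNU findutils and coreutils (`find -print0`, `sort -z`, `xargs -0 -r`); the function name `run_in_batches` is hypothetical:

```bash
#!/bin/bash
# Hypothetical helper: run a command once per batch of .raw files.
# Usage: run_in_batches DIR BATCH_SIZE COMMAND [ARGS...]
# The file names of each batch are appended to COMMAND.
run_in_batches() {
    local dir="$1" size="$2"
    shift 2
    # NUL-separated names survive spaces; sort -z gives a stable order;
    # xargs -r skips the command entirely when no file matches
    find "$dir" -name '*.raw' -type f -print0 \
        | sort -z \
        | xargs -0 -r -n "$size" "$@"
}
```

For example, `run_in_batches /data/incoming 100 bfabric-cli feeder create-importresource 1` would call the feeder once per group of up to 100 files. Passing `echo` as the command first is a cheap way to preview the batches.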
Monitor for Errors#
# Capture output and errors in a log file
bfabric-cli feeder create-importresource 1 /data/*.raw > feeder.log 2>&1
# Review any failures (case-insensitive; also matches the "not found" skip messages)
grep -iE "error|not found" feeder.log
Test on Small Batch First#
# Test with a few files before processing everything
bfabric-cli feeder create-importresource 1 /data/test/*.raw
# If successful, process the full batch
bfabric-cli feeder create-importresource 1 /data/production/*.raw
Common Issues#
Storage Not Found#
Error: Storage with ID X not found
Solution: Verify the storage exists:
bfabric-cli api read storage id <storage-id>
Files Do Not Exist#
Error: Files /path/to/file1.raw, /path/to/file2.raw do not exist
Solution: Check file paths and permissions:
ls -la /path/to/
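The same check can be scripted so that missing and unreadable files are reported separately before the feeder is called. This is a minimal sketch; the function name `check_files` is hypothetical and not part of bfabric-cli.

```bash
#!/bin/bash
# Hypothetical helper: report every argument that is missing, not a regular
# file, or not readable. Returns non-zero if any problem was found.
check_files() {
    local status=0 file
    for file in "$@"; do
        if [ ! -e "$file" ]; then
            echo "missing: $file"
            status=1
        elif [ ! -f "$file" ]; then
            echo "not a regular file: $file"
            status=1
        elif [ ! -r "$file" ]; then
            echo "not readable: $file"
            status=1
        fi
    done
    return "$status"
}
```

A glob that matches nothing is passed on literally by the shell, so `check_files /path/to/*.raw` reports it as missing instead of silently doing nothing.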
Application Not Found#
Error: Application X not found in B-Fabric. Skipping file /path/to/file.raw
Solution: The application derived from the path doesn’t exist in B-Fabric. Options:
Create the application in B-Fabric
Rename the directory to match an existing application
Verify the path convention is correct
# Check available applications
bfabric-cli api read application
Path Convention Mismatch#
Error: Files don’t follow the expected path structure
Solution: Ensure files are organized according to the storage’s path convention:
# Check storage configuration
bfabric-cli api read storage id <storage-id>
# Verify file structure
tree /data/
Integration with Data Ingestion Workflows#
The feeder command is typically used as part of a larger data ingestion pipeline:
File Transfer: Data is transferred to the storage location
Validation: File integrity is verified (checksums, sizes)
Importresource Creation: Feeder command creates importresources
Import Process: B-Fabric imports the data based on importresources
Sample Creation: Associated samples are created/updated
Analysis: Data becomes available for analysis
Example Ingestion Pipeline#
#!/bin/bash
# ingest_data.sh
# Stop at the first command that exits with an error, so files are not moved
# after a failed step
set -euo pipefail
STORAGE_ID=1
SOURCE_DIR="/data/incoming"
PROCESSED_DIR="/data/processed"
# 1. Validate files
echo "Validating files..."
for file in "$SOURCE_DIR"/*.raw; do
    # An unmatched glob stays literal, so this also catches an empty directory
    if [ ! -f "$file" ]; then
        echo "Error: $file does not exist" >&2
        exit 1
    fi
done
# 2. Create importresources
echo "Creating importresources..."
bfabric-cli feeder create-importresource "$STORAGE_ID" "$SOURCE_DIR"/*.raw
# 3. Move to processed directory
echo "Moving files to processed..."
mv "$SOURCE_DIR"/*.raw "$PROCESSED_DIR/"
echo "Ingestion complete!"
See Also#
API Operations - Generic CRUD operations for working with importresources
Storage Information - Storage entity documentation
Python API - Using bfabric in Python for custom feeder logic