Entity Caching#
Entity caching improves performance by storing entities in memory to avoid repeated API calls. This is particularly useful when working with the same entities multiple times or navigating complex relationships.
Overview#
Caching is enabled through the cache_entities() context manager. Within the context, entities are cached according to your
configuration.
Note
Caching is an optional optimization feature. bfabricPy works perfectly without caching, but performance can be significantly improved for complex operations.
Basic Usage#
Cache a Single Entity Type#
Enable caching for a specific entity type with an optional size limit:
from bfabric import Bfabric
from bfabric.entities.cache.context import cache_entities
client = Bfabric.connect()
reader = client.reader
with cache_entities("sample", max_size=100):
# First call fetches from API
sample1 = reader.read_id(entity_type="sample", entity_id=123)
# Second call uses cache (no API call)
sample2 = reader.read_id(entity_type="sample", entity_id=123)
# Both are the same object
assert sample1 is sample2
Unlimited Cache Size#
Set max_size=0 for unlimited cache size:
with cache_entities("sample", max_size=0):
# Cache as many samples as needed
samples = reader.read_ids(entity_type="sample", entity_ids=list(range(1, 1000)))
Cache Multiple Entity Types#
Cache multiple entity types with the same size limit:
with cache_entities(["sample", "project", "workunit"], max_size=500):
# All three types are cached
project = reader.read_id(entity_type="project", entity_id=1)
samples = reader.read_ids(entity_type="sample", entity_ids=[1, 2, 3])
workunits = reader.read_ids(entity_type="workunit", entity_ids=[10, 20])
Different Sizes per Type#
Configure different cache sizes for each entity type:
cache_config = {
"sample": 1000, # Cache up to 1000 samples
"project": 10, # Cache up to 10 projects
"workunit": 500, # Cache up to 500 workunits
}
with cache_entities(cache_config):
# Each type has its own size limit
pass
Advanced Usage#
Nested Contexts#
Cache contexts can be nested. Inner contexts can override or extend outer ones:
# Outer context caches samples
with cache_entities("sample", max_size=100):
sample1 = reader.read_id(entity_type="sample", entity_id=1)
# Inner context adds project caching
with cache_entities("project", max_size=10):
project = reader.read_id(entity_type="project", entity_id=1)
# Sample is still cached from outer context
sample2 = reader.read_id(entity_type="sample", entity_id=1)
assert sample1 is sample2
# Project cache is cleared, sample cache remains
# project = reader.read_id(entity_type="project", entity_id=1) # Would fetch from API
When to Use Caching#
Good Use Cases#
Complex relationship navigation: Traversing entity graphs (project → samples → workunits)
Repeated queries: Same entities accessed multiple times in a workflow
Batch processing: Loading entities once for multiple operations
Performance-sensitive operations: When API latency impacts user experience
When Caching May Not Help#
One-off queries: Loading each entity only once
Large datasets: Memory may be more limiting than API speed
Frequently changing data: When you always need fresh data
Cache Behavior#
Cache Stack#
bfabricPy uses a stack-based cache system. When you nest contexts, caches are layered:
with cache_entities("sample", max_size=100):
sample1 = reader.read_id(entity_type="sample", entity_id=1)
with cache_entities("sample", max_size=10):
sample2 = reader.read_id(entity_type="sample", entity_id=1)
# Returns cached entity from outer context
assert sample1 is sample2
# New entities go to inner cache (limited to 10)
sample3 = reader.read_id(entity_type="sample", entity_id=2)
# Inner cache cleared, outer cache remains
Cache Invalidation#
Caches are automatically invalidated when the context exits:
with cache_entities("sample", max_size=100):
sample = reader.read_id(entity_type="sample", entity_id=1)
# Sample is cached
# Cache is cleared here
sample = reader.read_id(entity_type="sample", entity_id=1)
# Sample fetched from API again
Performance Considerations#
Memory vs Speed Trade-off#
Small cache size: Less memory, more API calls
Large cache size: More memory, fewer API calls
Unlimited cache (
max_size=0): Maximum speed, highest memory usage
Choosing Cache Sizes#
Consider:
Number of entities you’ll access
Entity size (larger entities = more memory per entity)
Available memory in your environment
How often entities are reused
Example:
# High-throughput scenario: larger caches
with cache_entities({"sample": 10000, "project": 100}, max_size=0):
# Process thousands of entities efficiently
pass
# Memory-constrained scenario: smaller caches
with cache_entities({"sample": 100, "project": 10}):
# Keep memory usage low
pass
Complete Example#
from bfabric import Bfabric
from bfabric.entities.cache.context import cache_entities
client = Bfabric.connect()
reader = client.reader
# Configure caches for your workflow
cache_config = {
"project": 50, # Fewer projects, larger entities
"sample": 1000, # More samples, smaller entities
"workunit": 5000, # Many workunits, small entities
}
with cache_entities(cache_config):
# Load a project
project = reader.read_id(entity_type="project", entity_id=1)
# Get all samples for the project
samples = project.refs.get("member")
print(f"Loaded {len(samples)} samples")
# Process each sample
for sample in samples:
# Get workunits (may reuse workunits from other samples)
workunits = sample.refs.get("member")
print(f"Sample {sample.id}: {len(workunits)} workunits")
# Entities are cached, so repeated accesses are fast
project2 = reader.read_id(entity_type="project", entity_id=1)
assert project is project2 # Same object, no API call
Summary#
Use
cache_entities()to enable entity caching within a contextConfigure cache sizes based on your memory and performance needs
Caching is especially beneficial for relationship navigation and repeated access
Cache contexts can be nested for fine-grained control
Caches are automatically cleared when the context exits
Advanced: Cache Internals#
For details on cache implementation, see:
This includes:
cache_entities()- Context manager for enabling cachingEntityMemoryCache- In-memory cache implementationCacheStack- Stack-based cache management
See Entity Features for complete documentation of all entity types and their features.