epyr.performance

Performance Optimization Module for EPyR Tools

This module provides performance optimizations for handling large EPR datasets:

  • Memory-efficient data loading

  • Chunked processing for large files

  • Caching system for frequently accessed data

  • Memory usage monitoring and optimization

Usage:

from epyr.performance import OptimizedLoader, DataCache

# Use optimized loader for large files
loader = OptimizedLoader(chunk_size_mb=10, cache_enabled=True)
x, y, params, file_path_str = loader.load_epr_file(file_path)

# Use caching for repeated access
cache = DataCache(max_size_mb=100)
cached_data = cache.get(file_path)
if cached_data is None:
    cached_data = load_function(file_path)
    cache.put(file_path, cached_data)

Functions

get_global_cache()

Get the global data cache instance.

get_performance_info()

Return current system performance metrics.

optimize_numpy_operations()

Configure NumPy for optimal performance.

Classes

DataCache([max_size_mb])

LRU cache for frequently accessed EPR data files.

MemoryMonitor()

Monitor and optimize memory usage during data processing.

OptimizedLoader([chunk_size_mb, cache_enabled])

Optimized data loader for large EPR datasets.

class epyr.performance.MemoryMonitor

Monitor and optimize memory usage during data processing.

static get_memory_info()

Get current memory usage information.

Returns:

Memory information with keys rss, vms, and percent; sizes are in MB.

Return type:

dict
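
The three keys map naturally onto process-level counters. The following is a minimal sketch of how such a dictionary could be assembled, assuming the third-party psutil package is installed. The source does not say which backend MemoryMonitor uses, and get_memory_info_sketch is a hypothetical name, not part of epyr.

```python
import os


def get_memory_info_sketch():
    """Return {"rss", "vms", "percent"} for this process, sizes in MB.

    Hypothetical helper: returns None when psutil is not installed.
    """
    try:
        import psutil  # third-party; assumed backend, not confirmed by the docs
    except ImportError:
        return None
    process = psutil.Process(os.getpid())
    mem = process.memory_info()
    mb = 1024 * 1024
    return {
        "rss": mem.rss / mb,                  # resident set size
        "vms": mem.vms / mb,                  # virtual memory size
        "percent": process.memory_percent(),  # share of system RAM
    }
```

Dividing by 1024 * 1024 matches the "in MB" wording above; percent is left as a plain percentage.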

static check_memory_limit()

Check if memory usage is approaching configured limit.

Returns:

True if memory usage is acceptable, False if limit exceeded

Return type:

bool

static optimize_memory()

Perform memory optimization steps.
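
The documentation does not list the optimization steps. One plausible pattern is to check usage against a limit and trigger a garbage collection only when the limit is exceeded. The sketch below illustrates that pattern with the standard library; within_limit, free_memory, and guard are hypothetical names, and the use of gc.collect() is an assumption rather than the documented behavior of optimize_memory().

```python
import gc


def within_limit(used_mb, limit_mb):
    """True while usage is acceptable, False once the limit is exceeded."""
    return used_mb <= limit_mb


def free_memory():
    """Run a full garbage collection; returns the number of unreachable objects found."""
    return gc.collect()


def guard(used_mb, limit_mb):
    """Collect garbage only when over the limit; report whether a cleanup ran."""
    if within_limit(used_mb, limit_mb):
        return False
    free_memory()
    return True
```

Calling guard(used_mb, limit_mb) between processing steps keeps the cleanup cost off the common path where memory usage is fine.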

class epyr.performance.DataCache(max_size_mb=None)

LRU cache for frequently accessed EPR data files.

Parameters:

max_size_mb (int | None)

__init__(max_size_mb=None)

Initialize data cache.

Parameters:

max_size_mb (int | None) – Maximum cache size in MB. Uses config default if None.

get(file_path)

Get cached data for file if available and still valid.

Parameters:

file_path (Path)

Return type:

Tuple | None

put(file_path, data)

Cache data for file.

Parameters:
  • file_path (Path)

  • data (Tuple)

clear()


Clear all cached data.

get_stats()

Get cache statistics.

Return type:

Dict[str, Any]
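
To make the get/put/clear/get_stats contract concrete, here is a minimal LRU sketch with the same method names. It is not the EPyR Tools implementation. LRUFileCacheSketch is a hypothetical class, "still valid" is assumed to mean "file modification time unchanged", and entry sizes are estimated from the pickled length.

```python
import pickle
from collections import OrderedDict
from pathlib import Path


class LRUFileCacheSketch:
    """Hypothetical stand-in for DataCache; not the EPyR Tools implementation."""

    def __init__(self, max_size_mb=100):
        self.max_bytes = int(max_size_mb * 1024 * 1024)
        self._entries = OrderedDict()  # key -> (mtime, size_bytes, data)
        self._bytes = 0
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key_and_mtime(file_path):
        path = Path(file_path).resolve()
        return str(path), path.stat().st_mtime

    def get(self, file_path):
        try:
            key, mtime = self._key_and_mtime(file_path)
        except OSError:  # file vanished: nothing valid to return
            self.misses += 1
            return None
        entry = self._entries.get(key)
        if entry is None or entry[0] != mtime:  # absent, or file changed on disk
            self.misses += 1
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        self.hits += 1
        return entry[2]

    def put(self, file_path, data):
        key, mtime = self._key_and_mtime(file_path)
        size = len(pickle.dumps(data))  # rough size estimate (assumption)
        if key in self._entries:
            self._bytes -= self._entries.pop(key)[1]
        if size > self.max_bytes:
            return  # larger than the whole cache: do not store
        # Evict least recently used entries until the new one fits.
        while self._entries and self._bytes + size > self.max_bytes:
            _, (_, old_size, _) = self._entries.popitem(last=False)
            self._bytes -= old_size
        self._entries[key] = (mtime, size, data)
        self._bytes += size

    def clear(self):
        self._entries.clear()
        self._bytes = 0

    def get_stats(self):
        return {"entries": len(self._entries), "size_bytes": self._bytes,
                "hits": self.hits, "misses": self.misses}
```

An OrderedDict keeps the recency order for free: move_to_end on a hit and popitem(last=False) on eviction are the two operations that make it an LRU cache.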

class epyr.performance.OptimizedLoader(chunk_size_mb=None, cache_enabled=True)

Optimized data loader for large EPR datasets.

Parameters:
  • chunk_size_mb (int | None)

  • cache_enabled (bool)

__init__(chunk_size_mb=None, cache_enabled=True)

Initialize optimized loader.

Parameters:
  • chunk_size_mb (int | None) – Chunk size for processing large files

  • cache_enabled (bool) – Whether to use caching

load_epr_file(file_path)

Load EPR file with optimization for large datasets.

Parameters:

file_path (str | Path) – Path to EPR file

Returns:

Tuple of (x_data, y_data, parameters, file_path_str)

Return type:

Tuple
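
With cache_enabled=True, a loader of this kind would typically consult a cache before reading from disk. The sketch below shows that lookup-then-load pattern with a plain dict. load_with_cache and fake_loader are hypothetical helpers that only mimic the documented 4-tuple shape; how OptimizedLoader integrates its cache internally is not stated in the source.

```python
from pathlib import Path


def load_with_cache(file_path, load_function, cache):
    """Return the cached result for file_path, loading and storing it on a miss."""
    key = str(Path(file_path))
    if key in cache:
        return cache[key]
    result = load_function(file_path)
    cache[key] = result
    return result


def fake_loader(file_path):
    """Hypothetical loader returning (x_data, y_data, parameters, file_path_str)."""
    x_data = [0.0, 1.0, 2.0]
    y_data = [0.1, 0.4, 0.2]
    parameters = {"example": True}
    return x_data, y_data, parameters, str(file_path)


cache = {}
x, y, params, path_str = load_with_cache("spectrum.dsc", fake_loader, cache)
```

Note that the result unpacks into four names, matching the documented return value.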

load_chunked_data(file_path, chunk_processor)

Load and process large data files in chunks.

Parameters:
  • file_path (str | Path) – Path to data file

  • chunk_processor (Callable) – Function to process each chunk

Returns:

Processed result from chunk_processor

Return type:

Any
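
The chunk-and-callback idea can be sketched with the standard library alone. The example assumes a raw binary file of native float64 values whose size is a multiple of 8 bytes, and it collects the per-chunk results in a list. Both are assumptions: the source does not specify the file layout or how load_chunked_data combines results, and process_in_chunks is a hypothetical name.

```python
from array import array


def process_in_chunks(file_path, chunk_processor, chunk_size_mb=10):
    """Read float64 values in fixed-size chunks and apply chunk_processor to each."""
    item_size = array("d").itemsize  # 8 bytes per value
    # Round the chunk size down to a whole number of values (at least one).
    chunk_bytes = max(item_size,
                      int(chunk_size_mb * 1024 * 1024) // item_size * item_size)
    results = []
    with open(file_path, "rb") as fh:
        while True:
            raw = fh.read(chunk_bytes)
            if not raw:
                break
            values = array("d")
            values.frombytes(raw)  # only this chunk is held in memory
            results.append(chunk_processor(values))
    return results
```

For example, process_in_chunks(path, sum) yields one partial sum per chunk, and summing those gives the total without ever loading the full file.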

epyr.performance.optimize_numpy_operations()

Configure NumPy for optimal performance.

epyr.performance.get_performance_info()

Return current system performance metrics.

Returns:

Keys include memory, cpu_count, and config (cache and memory-monitor settings).

Return type:

dict

epyr.performance.get_global_cache()

Get the global data cache instance.

Return type:

DataCache
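
A "global instance" accessor is commonly built as a lazily created module-level singleton. The sketch below shows that idiom. CacheStub and get_global_cache_sketch are hypothetical names, and whether epyr creates its cache lazily or at import time is an assumption.

```python
_global_cache = None  # module-level slot for the shared instance


class CacheStub:
    """Hypothetical placeholder standing in for a real cache class."""

    def __init__(self):
        self.store = {}


def get_global_cache_sketch():
    """Create the shared cache on first use, then always return the same object."""
    global _global_cache
    if _global_cache is None:
        _global_cache = CacheStub()
    return _global_cache
```

Every caller receives the same object, so data stored through one reference is visible through all others.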