Builder Module¶
Module for preparing membrane protein systems with automated lipid bilayer construction, solvation, and parametrization. This module includes:
- Membrane system building with packmol-memgen
- Custom lipid composition (asymmetric membranes supported)
- Force field selection and validation
- Salt addition and ionic strength control
- Two-stage Propka workflow support
- Automated parametrization with pdb4amber and tleap
Import¶
from gatewizard.core.builder import Builder
from gatewizard.tools.force_fields import ForceFieldManager
Class: Builder¶
Main class for building membrane protein systems with complete control over lipid composition, force fields, and system parameters.
Constructor¶
Parameters: None
Returns: Builder instance
Default Configuration:
| Parameter | Default | Description |
|---|---|---|
water_model | "tip3p" | Water model for solvation |
protein_ff | "ff19SB" | Protein force field |
lipid_ff | "lipid21" | Lipid force field |
preoriented | True | Protein already oriented for membrane insertion |
parametrize | True | Run parametrization with tleap |
salt_concentration | 0.15 | Salt concentration in M (molar) |
cation | "K+" | Cation type for salt |
anion | "Cl-" | Anion type for salt |
dist_wat | 17.5 | Water layer thickness in Å |
notprotonate | False | Skip protonation (preserve residue names) |
Example 1: Basic Configuration¶
from gatewizard.core.builder import Builder
builder = Builder()
print(f"Default water model: {builder.config['water_model']}")
print(f"Default protein force field: {builder.config['protein_ff']}")
print(f"Default lipid parameters: {builder.config['lipid_ff']}")
Configuration Methods¶
Method: set_configuration()¶
Update configuration parameters for system building.
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
water_model | str | "tip3p" | Water model (tip3p, tip4p, spce, opc) |
protein_ff | str | "ff19SB" | Protein force field (ff14SB, ff19SB) |
lipid_ff | str | "lipid21" | Lipid force field (lipid17, lipid21) |
preoriented | bool | True | Whether protein is pre-oriented |
parametrize | bool | True | Run parametrization after packing |
salt_concentration | float | 0.15 | Salt concentration in M |
cation | str | "K+" | Cation type (Na+, K+, etc.) |
anion | str | "Cl-" | Anion type (Cl-, Br-, etc.) |
dist_wat | float | 17.5 | Water layer thickness in Å |
notprotonate | bool | False | Skip protonation during parametrization |
add_salt | bool | True | Whether to add salt to system |
Returns: None
Example 2: Custom Configuration¶
from gatewizard.core.builder import Builder
builder = Builder()
# Configure for specific system
builder.set_configuration(
water_model="tip3p",
protein_ff="ff14SB",
lipid_ff="lipid21",
salt_concentration=0.15,
cation="Na+",
anion="Cl-",
dist_wat=20.0, # Larger water layer
preoriented=True
)
Example 3: Available Water Models¶
from gatewizard.tools.force_fields import ForceFieldManager
ff_manager = ForceFieldManager()
# Get available water models
water_models = ff_manager.get_water_models()
print("\nAvailable Water Models:")
for water in water_models:
print(f" - {water}")
print(f"\nTotal: {len(water_models)} water models available")
Similarly can be done for lipids:
Example 4: Available Lipid Models¶
from gatewizard.tools.force_fields import ForceFieldManager
ff_manager = ForceFieldManager()
# Get all available lipids
lipids = ff_manager.get_available_lipids()
print(f"Total available lipids: {len(lipids)}\n")
for lipid in lipids:
print(f" - {lipid}")
print(f"\nTotal: {len(lipids)} lipid models available")
Example 5: Available Protein Force Fields¶
from gatewizard.tools.force_fields import ForceFieldManager
ff_manager = ForceFieldManager()
# Get available protein force fields
protein_ffs = ff_manager.get_protein_force_fields()
print("\nAvailable Protein Force Fields:")
for protein_ff in protein_ffs:
print(f" - {protein_ff}")
print(f"\nTotal: {len(protein_ffs)} protein force fields available")
Example 6: Available Lipid Force Fields¶
from gatewizard.tools.force_fields import ForceFieldManager
ff_manager = ForceFieldManager()
# Get available lipid force fields
lipid_ffs = ff_manager.get_lipid_force_fields()
print("\nAvailable Lipid Force Fields:")
for lipid_ff in lipid_ffs:
print(f" - {lipid_ff}")
print(f"\nTotal: {len(lipid_ffs)} lipid force fields available")
Validation Methods¶
Before preparing systems, you can validate your inputs and check force field compatibility.
Method: validate_system_inputs()¶
Validate all inputs before system preparation.
validate_system_inputs(
pdb_file: str,
upper_lipids: List[str],
lower_lipids: List[str],
lipid_ratios: str = "",
**kwargs
) -> Tuple[bool, str]
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
pdb_file | str | Yes | Path to input PDB file |
upper_lipids | List[str] | Yes | List of lipids for upper leaflet |
lower_lipids | List[str] | Yes | List of lipids for lower leaflet |
lipid_ratios | str | No | Lipid ratios string |
**kwargs | Any | No | Additional parameters to validate |
Returns: Tuple[bool, str]
bool: Validation statusstr: Error message (empty if valid)
Validation Checks:
- PDB file exists and is readable
- All lipid types are valid/available
- Lipid ratios match number of lipids
- Ratios are positive numbers
- Force field combinations are compatible
- Ion types are valid
Example 7: Input Validation¶
from gatewizard.core.builder import Builder
builder = Builder()
# Validate inputs before preparation
valid, msg = builder.validate_system_inputs(
pdb_file="protein_protonated_prepared.pdb",
upper_lipids=["POPC", "POPE"],
lower_lipids=["POPC", "POPE"],
lipid_ratios="7:3//7:3",
water_model="tip3p",
protein_ff="ff14SB",
lipid_ff="lipid21"
)
if valid:
if "WARNING" in msg:
print(f"[!] Inputs valid with warning: {msg}")
print("You may proceed at your own risk")
else:
print("[OK] All inputs are valid, proceed with preparation")
# Now call prepare_system()
else:
print(f"[ERROR] Validation failed: {msg}")
# Fix issues before proceeding
Example 8: Force Field Validation¶
from gatewizard.tools.force_fields import ForceFieldManager
ff_manager = ForceFieldManager()
valid, message, is_warning = ff_manager.validate_combination("tip3p", "ff14SB", "lipid21")
if valid:
if is_warning:
print(f"[!] Warning: {message}")
else:
print("[OK] Force field combination is compatible")
else:
print(f"[ERROR] Incompatible: {message}")
Core Preparation Methods¶
Method: prepare_system()¶
Prepare a complete membrane protein system with lipid bilayer, water, and ions.
prepare_system(
pdb_file: str,
working_dir: str,
upper_lipids: List[str],
lower_lipids: List[str],
lipid_ratios: str = "",
**kwargs
) -> Tuple[bool, str, Optional[Path]]
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
pdb_file | str | Yes | Path to input PDB file (oriented protein) |
working_dir | str | Yes | Working directory for output files |
upper_lipids | List[str] | Yes | List of lipid types for upper leaflet |
lower_lipids | List[str] | Yes | List of lipid types for lower leaflet |
lipid_ratios | str | No | Lipid molar ratios (format: "ratio1:ratio2//ratio3:ratio4") |
output_folder_name | str | No | Custom output folder name (kwarg) |
wait | bool | No | Block until the job completes or errors (default False) |
wait_timeout | float | No | Maximum seconds to wait when wait=True (None = unlimited) |
wait_poll_interval | float | No | Seconds between status checks (default 5.0) |
wait_verbose | bool | No | Print elapsed-time progress while waiting (default True) |
**kwargs | Any | No | Additional configuration parameters (override defaults) |
Returns: Tuple[bool, str, Optional[Path]]
bool: Success statusstr: Status messageOptional[Path]: Job directory path (None if failed)
Lipid Ratios Format:
The lipid_ratios string specifies molar ratios for each leaflet:
- Format:
"upper_ratio1:upper_ratio2//lower_ratio1:lower_ratio2" - Ratios must match the order of lipids in the lipid lists
- Ratios are normalized automatically (don't need to sum to 1.0)
- Use
//to separate upper and lower leaflet ratios
Available Lipids:
| Category | Lipid Types |
|---|---|
| Phosphatidylcholine (PC) | DHPC, DLPC, DMPC, DPPC, DSPC, DOPC, POPC, PLPC, SOPC |
| Phosphatidylethanolamine (PE) | DHPE, DLPE, DMPE, DPPE, DSPE, DOPE, POPE, PLPE, SOPE |
| Phosphatidylserine (PS) | DHPS, DLPS, DMPS, DPPS, DSPS, DOPS, POPS, PLPS, SOPS |
| Phosphatidylglycerol (PG) | DHPG, DLPG, DMPG, DPPG, DSPG, DOPG, POPG, PLPG, SOPG |
| Phosphatidic Acid (PA) | DHPA, DLPA, DMPA, DPPA, DSPA, DOPA, POPA, PLPA, SOPA |
| Sphingomyelins | PSM, ASM, LSM, MSM, HSM, OSM |
| Cholesterol | CHL1 |
| Other | CARDIOLIPIN, PIP, PIP2 |
Output Files:
In {output_folder_name}/ directory:
bilayer_*.pdb- Packed system (protein in membrane)bilayer_*_lipid.pdb- Lipid bilayer with CRYST1 info (for equilibration)system.prmtop- AMBER topology file (if parametrize=True)system.inpcrd- AMBER coordinate file (if parametrize=True)system_solv.pdb- Solvated system in PDB formatsystem_4amber.pdb- PDB prepared by pdb4ambertleap.in- tleap input scriptlogs/preparation.log- Main execution loglogs/parametrization.log- Parametrization logstatus.json- Job status trackingrun_preparation.sh- Execution script
Raises:
FileNotFoundError- If input PDB file doesn't existBuilderError- If validation fails or system preparation fails
Note: By default, system preparation runs in the background and the method returns immediately. Pass wait=True to block until the job finishes — useful for scripting sequential preparations. For asynchronous monitoring, use the log files or the JobMonitor class (see examples below).
Example 9: Simple Symmetric Membrane¶
from gatewizard.core.builder import Builder
builder = Builder()
# Configure system
builder.set_configuration(
water_model="tip3p",
protein_ff="ff14SB",
lipid_ff="lipid21",
salt_concentration=0.15,
cation="K+",
anion="Cl-"
)
# Prepare system with 100% POPC (symmetric)
success, message, job_dir = builder.prepare_system(
pdb_file="protein_protonated_prepared.pdb",
working_dir="./systems",
upper_lipids=["POPC"],
lower_lipids=["POPC"],
lipid_ratios="1//1", # 100% POPC both leaflets
output_folder_name="popc_membrane"
)
if success:
print(f"✓ System preparation started in background")
print(f" {message}")
print(f" Job directory: {job_dir}")
print(f" Monitor progress: {job_dir / 'preparation.log'}")
print(f" When complete, files will be at:")
print(f" - Topology: {job_dir / 'system.prmtop'}")
print(f" - Coordinates: {job_dir / 'system.inpcrd'}")
else:
print(f"✗ Preparation failed: {message}")
Example 10: Monitoring Job Progress¶
from gatewizard.core.job_monitor import JobMonitor
from pathlib import Path
# Create monitor for your working directory
monitor = JobMonitor(working_directory=Path("./systems"))
# Scan for jobs
monitor.scan_for_jobs(force=True)
# Get active jobs
active_jobs = monitor.get_active_jobs()
if active_jobs:
print(f"✓ Found {len(active_jobs)} active job(s)")
for job_id, job_info in active_jobs.items():
print(f"Job: {job_info.job_dir.name}")
print(f" Status: {job_info.status.value}")
print(f" Progress: {job_info.progress:.1f}%")
print(f" Current step: {job_info.current_step}")
print(f" Elapsed: {job_info.elapsed_time:.1f}s")
# Show completed steps
if job_info.steps_completed:
print(f" Completed steps:")
for step in job_info.steps_completed:
print(f" ✓ {step}")
else:
print("No active jobs found")
# Check for completed jobs
completed_jobs = monitor.get_completed_jobs()
if completed_jobs:
print(f"✓ Found {len(completed_jobs)} completed job(s)")
for job_id, job_info in completed_jobs.items():
print(f" - {job_info.job_dir.name}: {job_info.status.value}")
Example 11: Asymmetric Membrane with Multiple Lipids¶
from gatewizard.core.builder import Builder
builder = Builder()
# Configure for asymmetric membrane
builder.set_configuration(
water_model="tip3p",
protein_ff="ff14SB",
lipid_ff="lipid21",
# Note: When using anionic lipids (POPS, POPG, etc.), the system may require
# a higher salt concentration for neutralization. If you get an error like:
# "The concentration of ions required to neutralize the system is higher than
# the concentration specified", increase salt_concentration or use add_salt=False.
# POPS is anionic (-1 charge), so 50% POPS in lower leaflet adds significant
# negative charge requiring more cations for neutralization.
salt_concentration=0.5, # Increased from default 0.15 M due to anionic lipids
dist_wat=20.0 # Thicker water layer
)
# Upper leaflet: 70% POPC + 30% cholesterol
# Lower leaflet: 50% POPE + 50% POPS (anionic)
success, message, job_dir = builder.prepare_system(
pdb_file="protein_protonated_prepared.pdb",
working_dir="./systems",
upper_lipids=["POPC", "CHL1"],
lower_lipids=["POPE", "POPS"],
lipid_ratios="7:3//5:5", # Ratios normalized automatically
output_folder_name="asymmetric_membrane"
)
if success:
print(f"✓ System preparation started in background")
print(f" {message}")
print(f" Upper leaflet: 70% POPC, 30% CHL1")
print(f" Lower leaflet: 50% POPE, 50% POPS")
print(f" Job directory: {job_dir}")
print(f" Monitor: {job_dir / 'logs/preparation.log'}")
else:
print(f"✗ Preparation failed: {message}")
Example 12: Complex Composition (Plasma Membrane Mimic)¶
from gatewizard.core.builder import Builder
builder = Builder()
# Plasma membrane-like composition
# Upper: PC-rich with cholesterol
# Lower: PE/PS-rich (cytoplasmic side)
success, message, job_dir = builder.prepare_system(
pdb_file="protein_protonated_prepared.pdb",
working_dir="./systems",
upper_lipids=["POPC", "POPE", "CHL1"],
lower_lipids=["POPE", "POPS", "CHL1"],
lipid_ratios="5:2:3//4:4:2", # Upper: 50% POPC, 20% POPE, 30% CHL1
# Lower: 40% POPE, 40% POPS, 20% CHL1
output_folder_name="plasma_membrane",
salt_concentration=0.5,
cation="Na+", # Use sodium instead of potassium
dist_wat=25.0 # Extra water for large protein
)
if success:
print(f"✓ System preparation started in background")
print(f" {message}")
print(f" Job directory: {job_dir}")
print(f" Monitor: {job_dir / 'logs/preparation.log'}")
else:
print(f"✗ Preparation failed: {message}")
Example 13: Packing Only (No Parametrization)¶
from gatewizard.core.builder import Builder
builder = Builder()
# Only pack the system, don't parametrize
# Useful for visual inspection before parametrization
success, message, job_dir = builder.prepare_system(
pdb_file="protein_protonated_prepared.pdb",
working_dir="./systems",
upper_lipids=["POPC"],
lower_lipids=["POPC"],
lipid_ratios="1//1",
output_folder_name="packed_only",
parametrize=False # Skip parametrization
)
if success:
print(f"✓ Packing started in background (no parametrization)")
print(f" {message}")
print(f" Job directory: {job_dir}")
print(f" Monitor: {job_dir / 'logs/preparation.log'}")
print(f" When complete, inspect: {job_dir / 'bilayer_*.pdb'}")
else:
print(f"✗ Preparation failed: {message}")
Example 14: Custom Salt Concentration¶
from gatewizard.core.builder import Builder
builder = Builder()
# High salt concentration for ionic strength studies
success, message, job_dir = builder.prepare_system(
pdb_file="protein_protonated_prepared.pdb",
working_dir="./systems",
upper_lipids=["POPC", "POPS"],
lower_lipids=["POPC", "POPS"],
lipid_ratios="8:2//8:2", # 80% POPC, 20% POPS
output_folder_name="high_salt",
salt_concentration=2.0, # 2000 mM (high salt)
cation="Na+",
anion="Cl-"
)
if success:
print(f"✓ System preparation started in background")
print(f" {message}")
print(f" High salt: 2.0 M NaCl")
print(f" Job directory: {job_dir}")
print(f" Monitor: {job_dir / 'logs/preparation.log'}")
else:
print(f"✗ Preparation failed: {message}")
Example 15: No Salt (Charge Neutralization Only)¶
from gatewizard.core.builder import Builder
builder = Builder()
# Neutralize system charges only (no extra salt)
success, message, job_dir = builder.prepare_system(
pdb_file="protein_protonated_prepared.pdb",
working_dir="./systems",
upper_lipids=["POPC"],
lower_lipids=["POPC"],
lipid_ratios="1//1",
output_folder_name="no_salt",
salt_concentration=0.0, # Only neutralize, no extra salt
add_salt=True # Still add ions for neutralization
)
if success:
print(f"✓ System preparation started in background")
print(f" {message}")
print(f" Neutralization only, no extra salt")
print(f" Job directory: {job_dir}")
print(f" Monitor: {job_dir / 'logs/preparation.log'}")
else:
print(f"✗ Preparation failed: {message}")
Method: wait_for_completion()¶
Block until a preparation job completes or fails by polling status.json.
wait_for_completion(
job_dir,
poll_interval: float = 5.0,
timeout: Optional[float] = None,
verbose: bool = True,
) -> Tuple[bool, str]
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
job_dir | Path \| str | — | Path to the job directory |
poll_interval | float | 5.0 | Seconds between status checks |
timeout | float \| None | None | Maximum seconds to wait (None = no limit) |
verbose | bool | True | Print elapsed-time progress to stdout |
Returns: Tuple[bool, str] — (success, message)
Example 25: Blocking (synchronous) Preparation with wait=True¶
"""
Builder Example 25: Blocking (synchronous) preparation with wait=True
Demonstrates how to run prepare_system() in blocking mode so that the
Python script does not continue until the job finishes (or errors).
This is especially useful for scripting multiple sequential preparations.
Two approaches are shown:
A) Inline ``wait=True`` inside prepare_system()
B) Launch in background + call wait_for_completion() later
NOTE: This example requires AmberTools and packmol-memgen.
It uses the real test PDB file tests/2MVJ_2ligs.pdb.
"""
from gatewizard.core.builder import Builder
from gatewizard.tools.ligand_parametrization import (
detect_ligands,
parametrize_all_ligands,
)
pdb_file = "tests/2MVJ_2ligs.pdb"
working_dir = "./systems"
# ── Step 1: Detect and parametrize ligands (same as example 23) ──────
print("Step 1: Detecting ligands...")
ligands = detect_ligands(pdb_file)
for lig in ligands:
print(f" Found: {lig.name} ({lig.num_atoms} atoms, {lig.formula})")
print("\nStep 2: Parametrizing ligands...")
ligand_results = parametrize_all_ligands(
pdb_file=pdb_file,
output_dir=f"{working_dir}/ligand_params",
charges={"AAA": 0, "BBB": 0},
charge_method="bcc",
)
print(f" Parametrized: {list(ligand_results.keys())}")
# ── Step 3: Configure builder ────────────────────────────────────────
builder = Builder()
builder.set_configuration(
water_model="tip3p",
protein_ff="ff14SB",
lipid_ff="lipid21",
preoriented=True,
parametrize=True,
salt_concentration=0.15,
dist_wat=17.5,
notprotonate=True,
ligand_params=ligand_results,
)
# ── Approach A: inline wait ──────────────────────────────────────────
# With wait=True the call blocks until the job finishes.
# prepare_system returns (success, message, job_dir) where
# success reflects the *final* outcome, not just "launched OK".
#
# NOTE: The actual preparation is long-running. Set a short timeout
# so the example returns quickly during tests; remove or increase
# the timeout for real production runs.
print("\n── Approach A: prepare_system with wait=True ──")
success_a, message_a, job_dir_a = builder.prepare_system(
pdb_file=pdb_file,
working_dir=working_dir,
upper_lipids=["POPC"],
lower_lipids=["POPC"],
lipid_ratios="1//1",
output_folder_name="membrane_2MVJ_wait",
wait=True, # ← block until done or error
wait_timeout=10, # short timeout for demo (use 3600+ for real runs)
wait_poll_interval=2, # check every 2 s
wait_verbose=True, # print elapsed time
)
print(f"Result : {success_a}")
print(f"Message: {message_a}")
# ── Approach B: launch then wait later ───────────────────────────────
print("\n── Approach B: launch + wait_for_completion() ──")
success_b, message_b, job_dir_b = builder.prepare_system(
pdb_file=pdb_file,
working_dir=working_dir,
upper_lipids=["POPC"],
lower_lipids=["POPC"],
lipid_ratios="1//1",
output_folder_name="membrane_2MVJ_bg",
)
if success_b and job_dir_b is not None:
# Do other work here …
print("Doing other work while system builds …")
# Then block until that specific job is done
completed, wait_msg = builder.wait_for_completion(
job_dir_b,
poll_interval=2,
timeout=10, # short timeout for demo
verbose=True,
)
print(f"Completed: {completed}")
print(f"Message : {wait_msg}")
else:
print(f"Not launched: {message_b}")
print("\nExample 25 finished.")
Job Monitoring¶
Since system preparation runs in the background, GateWizard provides the JobMonitor class to track progress, check status, and manage running jobs.
Class: JobMonitor¶
Monitor and track system preparation jobs.
Example 16: JobMonitor class¶
from gatewizard.core.job_monitor import JobMonitor
from pathlib import Path
monitor = JobMonitor(working_directory=Path("./systems"))
Constructor Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
working_directory | Path or str | Path.cwd() | Directory to monitor for jobs |
Returns: JobMonitor instance
Method: scan_for_jobs()¶
Scan the working directory for preparation jobs.
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
force | bool | False | Force scan even if within scan interval (3 seconds) |
Returns: None (updates internal job list)
Description: Scans the working directory for subdirectories containing status.json files, which indicate active or completed preparation jobs.
Method: get_active_jobs()¶
Get all currently running jobs.
Returns: Dict[str, JobInfo]
- Dictionary mapping job IDs to
JobInfoobjects - Only includes jobs with status
RUNNING
Method: get_completed_jobs()¶
Get all completed or errored jobs.
Returns: Dict[str, JobInfo]
- Dictionary mapping job IDs to
JobInfoobjects - Includes jobs with status
COMPLETEDorERROR
Method: get_job()¶
Get specific job by ID.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
job_id | str | Yes | Job identifier (usually the full job directory path) |
Returns: Optional[JobInfo]
JobInfoobject if job existsNoneif job not found
Method: refresh_job()¶
Force refresh of a specific job's status.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
job_id | str | Yes | Job identifier to refresh |
Returns: bool
Trueif job was refreshed successfullyFalseif job not found or refresh failed
JobInfo Object¶
Each job is represented by a JobInfo object with the following attributes:
| Attribute | Type | Description |
|---|---|---|
job_id | str | Unique job identifier |
job_dir | Path | Job directory path |
status | JobStatus | Current status (RUNNING, COMPLETED, ERROR, UNKNOWN) |
progress | float | Progress percentage (0.0 - 100.0) |
current_step | str | Name of current processing step |
steps_completed | List[str] | List of completed step names |
elapsed_time | float | Elapsed time in seconds |
start_time | datetime | Job start timestamp |
end_time | Optional[datetime] | Job end timestamp (None if running) |
JobStatus Values:
RUNNING- Job is currently executingCOMPLETED- Job finished successfullyERROR- Job encountered an errorUNKNOWN- Status cannot be determined
Example 17: Real-time Progress Tracking¶
from gatewizard.core.job_monitor import JobMonitor
from pathlib import Path
import time
# Start a system preparation (runs in background)
from gatewizard.core.builder import Builder
builder = Builder()
success, message, job_dir = builder.prepare_system(
pdb_file="protein_protonated_prepared.pdb",
working_dir="./systems",
upper_lipids=["POPC"],
lower_lipids=["POPC"],
lipid_ratios="1//1"
)
if success:
print(f"✓ Job started: {job_dir}")
# Monitor progress in real-time
monitor = JobMonitor(working_directory=Path("./systems"))
while True:
monitor.scan_for_jobs(force=True)
active_jobs = monitor.get_active_jobs()
if not active_jobs:
# Job completed
completed = monitor.get_completed_jobs()
if completed:
for job_id, job_info in completed.items():
if str(job_dir) in job_id:
print(f"\n✓ Job completed: {job_info.status.value}")
print(f" Total time: {job_info.elapsed_time:.1f}s")
break
break
# Show progress
for job_id, job_info in active_jobs.items():
if str(job_dir) in job_id:
print(f"\rProgress: {job_info.progress:.1f}% - {job_info.current_step}", end="", flush=True)
time.sleep(2) # Check every 2 seconds
Example 18: Monitoring Batch Job Management¶
from gatewizard.core.job_monitor import JobMonitor
from pathlib import Path
# Monitor all jobs in a directory
monitor = JobMonitor(working_directory=Path("./systems"))
monitor.scan_for_jobs(force=True)
# Check active jobs
active_jobs = monitor.get_active_jobs()
print(f"\n{'='*60}")
print(f"ACTIVE JOBS: {len(active_jobs)}")
print(f"{'='*60}")
for job_id, job_info in active_jobs.items():
print(f"\n📁 {job_info.job_dir.name}")
print(f" Status: {job_info.status.value}")
print(f" Progress: {job_info.progress:.1f}%")
print(f" Current: {job_info.current_step}")
print(f" Runtime: {job_info.elapsed_time:.0f}s")
if job_info.steps_completed:
print(f" Completed steps:")
for step in job_info.steps_completed:
print(f" ✓ {step}")
# Check completed jobs
completed_jobs = monitor.get_completed_jobs()
print(f"\n{'='*60}")
print(f"COMPLETED JOBS: {len(completed_jobs)}")
print(f"{'='*60}")
for job_id, job_info in completed_jobs.items():
status_icon = "✓" if job_info.status.value == "completed" else "✗"
print(f"{status_icon} {job_info.job_dir.name}: {job_info.status.value} ({job_info.elapsed_time:.0f}s)")
Monitoring Tips¶
Real-time monitoring:
- Use
scan_for_jobs(force=True)to bypass the 3-second scan interval - Check status every 1-2 seconds for responsive progress updates
- Use
flush=Truein print statements for immediate display
Long-running jobs:
- Check status periodically (every 30-60 seconds)
- Log progress to file for later review
- Use
elapsed_timeto estimate remaining time
Multiple jobs:
- Monitor entire working directory for batch processing
- Filter jobs by status for targeted monitoring
- Use job_dir.name for user-friendly display
Debugging:
- Check
job_dir / 'logs/preparation.log'for detailed output - Check
job_dir / 'logs/parametrization.log'if parametrization fails - Examine
status.jsonfor detailed job state
Manual log inspection:
# Follow preparation log in real-time
tail -f systems/popc_membrane/logs/preparation.log
# Check for errors
grep -i error systems/popc_membrane/logs/*.log
# View job status
cat systems/popc_membrane/status.json
Tips and Best Practices¶
Lipid Selection¶
For simple systems:
- Use 100% POPC for both leaflets (symmetric membrane)
- Well-characterized, stable, and widely used
- Good starting point for method development
For realistic membranes:
- Include cholesterol (10-30%) for membrane stability
- Use PE lipids (POPE) for inner leaflet (more physiological)
- Add PS lipids (POPS) for negative charge (cytoplasmic side)
For specialized studies:
- Use asymmetric compositions (different upper/lower)
- Match experimental lipid compositions when available
- Consider protein's native membrane environment
Force Field Selection¶
Default recommendation (most stable):
- Water: tip3p
- Protein: ff14SB
- Lipid: lipid21
For latest parameters:
- Water: opc
- Protein: ff19SB
- Lipid: lipid21
Salt Concentration¶
Physiological (default):
- 0.15 M (150 mM) NaCl or KCl
- Mimics intracellular or extracellular conditions
High salt studies:
- 0.5-1.0 M for ionic strength effects
- May require longer equilibration
Charge neutralization only:
- Set
salt_concentration=0.0 - System will still be neutralized with minimum ions
Water Layer Thickness¶
Default (17.5 Å):
- Sufficient for most membrane proteins
- ~3-4 water layers above/below membrane
Large proteins or complexes:
- Increase to 20-25 Å
- Ensures adequate solvation
Minimize system size:
- Decrease to 15 Å (minimum recommended)
- Faster simulations but ensure full solvation
Protein Orientation¶
Pre-oriented proteins:
- Use OPM database
- Protein already positioned in membrane plane
- Set
preoriented=True(default)
Non-oriented proteins:
- Let packmol-memgen use MEMEMBED for orientation
- Set
preoriented=False - Adds extra time to preparation
Troubleshooting¶
"PDB file not found":
- Check absolute/relative paths
- Verify file permissions
- Ensure PDB file is valid format
"Packing failed":
- Protein may be too large for default settings
- Try increasing
dist_watparameter - Check protein orientation (OPM database)
- Verify protein is pre-oriented if using
preoriented=True
"Parametrization failed":
- Check logs/parametrization.log for details
- May have non-standard residues
- Verify protonation states (use Propka workflow)
- Check for missing atoms or unusual residue names
"Invalid lipid ratios":
- Ensure ratios match number of lipids
- Format: "ratio1:ratio2//ratio3:ratio4"
- Ratios don't need to sum to 1.0 (auto-normalized)
- Example: "7:3//5:5" for 2 lipids per leaflet
"Force field compatibility error":
- Use ForceFieldManager to check combinations
- Not all force fields are compatible
- Use recommended combinations for stability
Performance Tips¶
Faster preparation:
- Use symmetric membranes (same upper/lower)
- Fewer lipid types (1-2 per leaflet)
- Pre-oriented proteins
- Smaller water layers (15-17.5 Å)
More realistic systems:
- Asymmetric membranes (different leaflets)
- Multiple lipid types (3-4 per leaflet)
- Include cholesterol
- Larger water layers (20-25 Å)
- Use Propka workflow for correct protonation
Ligand Parametrization¶
Module for detecting, extracting, and parametrizing non-standard (ligand) residues found in PDB files. Uses the AMBER/GAFF workflow (antechamber → parmchk2 → tleap) to generate force field parameters that integrate with the Builder's packmol-memgen and final tleap steps. Supports both GAFF and GAFF2 atom type sets, with recommended pairings per the AMBER manual: gaff/bcc and gaff2/abcg2.
Import¶
from gatewizard.tools.ligand_parametrization import (
detect_ligands,
extract_ligand_pdb,
parametrize_ligand,
parametrize_all_ligands,
get_ligand_2d_image,
get_ligand_2d_image_from_pdb_lines,
build_ligand_param_args,
build_tleap_ligand_lines,
LigandInfo,
LigandParametrizationError,
STANDARD_RESIDUES,
CHARGE_METHODS,
ATOM_TYPES,
DEFAULT_CHARGE_METHOD,
DEFAULT_ATOM_TYPE,
RECOMMENDED_COMBOS,
NON_RECOMMENDED_COMBOS,
)
Class: LigandInfo¶
Information container for a detected ligand residue.
| Attribute | Type | Description |
|---|---|---|
name | str | 3-letter residue name (e.g., "AAA") |
chain | str | Chain identifier |
res_id | int | Residue sequence number |
num_atoms | int | Number of atoms in the ligand |
elements | Dict[str, int] | Element counts (e.g., {'C': 9, 'H': 10, 'O': 1}) |
pdb_lines | List[str] | Raw HETATM lines from the PDB file |
formula | str (property) | Molecular formula string (e.g., "C9H10O") |
Methods:
| Method | Returns | Description |
|---|---|---|
to_dict() | Dict[str, Any] | Convert to serializable dictionary |
Constants¶
| Constant | Type | Description |
|---|---|---|
STANDARD_RESIDUES | set | Residue names excluded from detection (amino acids, water, ions, lipids, capping groups) |
CHARGE_METHODS | dict | Antechamber charge methods: 'bcc' (AM1-BCC), 'abcg2' (ABCG2), 'gas' (Gasteiger), 'mul' (Mulliken), 'cm2' (CM2), 'rc' (Read-in), 'resp' (RESP — requires Gaussian), 'esp' (ESP — requires Gaussian) |
ATOM_TYPES | dict | Antechamber atom type sets: 'gaff2' → 'GAFF2', 'gaff' → 'GAFF' |
DEFAULT_CHARGE_METHOD | str | 'bcc' |
DEFAULT_ATOM_TYPE | str | 'gaff2' |
RECOMMENDED_COMBOS | set | Recommended (atom_type, charge_method) pairings per AMBER manual: {('gaff', 'bcc'), ('gaff2', 'abcg2')} |
NON_RECOMMENDED_COMBOS | set | Non-recommended pairings (warning shown in GUI): {('gaff2', 'bcc'), ('gaff', 'abcg2')} |
Recommended Atom Type / Charge Method Pairings
Per the AMBER manual, the efficient charge models bcc (AM1-BCC) and abcg2 (ABCG2) should be paired with specific atom type sets:
| Atom Type | Charge Method | Status |
|---|---|---|
gaff | bcc | Recommended ✓ |
gaff2 | abcg2 | Recommended ✓ |
gaff2 | bcc | Not recommended ✗ |
gaff | abcg2 | Not recommended ✗ |
The GUI shows an amber warning label when a non-recommended combination is selected, and displays a confirmation dialog before proceeding with parametrization.
External QM Software Requirements
The resp and esp charge methods require Gaussian (external quantum-mechanics software, not included in AmberTools). All other methods (bcc, abcg2, gas, mul, cm2) use sqm, which is bundled with AmberTools.
Function: detect_ligands()¶
Detect non-standard (ligand) residues in a PDB file by scanning HETATM records.
Parameters:
| Parameter | Type | Description |
|---|---|---|
pdb_file | str | Path to the PDB file to analyze |
Returns: List[LigandInfo] — one entry per unique ligand residue name
Raises: LigandParametrizationError if the file cannot be read
Example 19: Detect Ligands¶
"""
Builder Example 19: Detect ligands in a PDB file
Demonstrates how to detect non-standard residues (ligands)
in a PDB file using the ligand parametrization tools.
"""
from gatewizard.tools.ligand_parametrization import detect_ligands
# Detect ligands in a PDB file with two ligands (AAA and BBB)
pdb_file = "tests/2MVJ_2ligs.pdb"
ligands = detect_ligands(pdb_file)
print(f"Detected {len(ligands)} ligand(s) in {pdb_file}:")
print(f"{'='*60}")
for lig in ligands:
print(f"\nLigand: {lig.name}")
print(f" Chain: {lig.chain}")
print(f" Residue ID: {lig.res_id}")
print(f" Number of atoms: {lig.num_atoms}")
print(f" Molecular formula: {lig.formula}")
print(f" Elements: {lig.elements}")
print(f" As dict: {lig.to_dict()}")
Function: extract_ligand_pdb()¶
Extract a single ligand from a PDB file into its own file.
Parameters:
| Parameter | Type | Description |
|---|---|---|
pdb_file | str | Path to the source PDB file |
ligand_name | str | 3-letter residue name of the ligand |
output_dir | str | Directory to write the extracted PDB file |
Returns: str — path to the extracted PDB file (output_dir/LIGAND.pdb)
Raises: LigandParametrizationError if the ligand is not found or extraction fails
Example 20: Extract a Ligand¶
"""
Builder Example 20: Extract a ligand from a PDB file
Demonstrates how to extract a specific ligand into its own PDB file
for individual parametrization.
"""
from pathlib import Path
from gatewizard.tools.ligand_parametrization import detect_ligands, extract_ligand_pdb
pdb_file = "tests/2MVJ_2ligs.pdb"
output_dir = "./systems/ligand_extraction"
# First detect ligands
ligands = detect_ligands(pdb_file)
print(f"Detected ligands: {[l.name for l in ligands]}")
# Extract each ligand to its own subdirectory
for lig in ligands:
lig_dir = str(Path(output_dir) / lig.name)
extracted_pdb = extract_ligand_pdb(pdb_file, lig.name, lig_dir)
# Verify extraction
with open(extracted_pdb) as f:
lines = [l for l in f if l.startswith("HETATM")]
print(f"\nExtracted {lig.name}:")
print(f" Output: {extracted_pdb}")
print(f" Atoms: {len(lines)}")
print(f" First line: {lines[0].strip()[:60]}...")
print(f"\nAll extracted ligands saved in: {output_dir}")
Function: parametrize_ligand()¶
Parametrize a single ligand using the AMBER/GAFF workflow (antechamber → parmchk2 → tleap).
parametrize_ligand(
ligand_pdb: str,
ligand_name: str,
output_dir: str,
charge: int = 0,
charge_method: str = 'bcc',
multiplicity: int = 1,
atom_type: str = 'gaff2',
) -> Dict[str, str]
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
ligand_pdb | str | — | Path to the ligand PDB file |
ligand_name | str | — | 3-letter residue name (must match PDB) |
output_dir | str | — | Directory for output files |
charge | int | 0 | Net charge of the ligand |
charge_method | str | 'bcc' | Charge method (see CHARGE_METHODS) |
multiplicity | int | 1 | Spin multiplicity |
atom_type | str | 'gaff2' | Atom type set — 'gaff2' or 'gaff' (see ATOM_TYPES) |
Returns: Dictionary with paths to generated files:
| Key | Description |
|---|---|
'mol2' | Typed MOL2 file (GAFF/GAFF2 atom types + charges) |
'frcmod' | Force field modification file (missing parameters) |
'lib' | Residue library file for tleap |
'prmtop' | AMBER topology file |
'inpcrd' | AMBER coordinate file |
Raises: LigandParametrizationError if any step fails
Critical: Variable Name Must Match Residue Name
The tleap variable name must match the residue name in the PDB. For example, AAA = loadmol2 AAA.mol2 — using a generic name like mol = loadmol2 AAA.mol2 will cause saveoff to store the unit under the wrong name and tleap will fail to recognize the residue when loading the full system PDB.
Function: parametrize_all_ligands()¶
Detect and parametrize all ligands in a PDB file in one call.
parametrize_all_ligands(
pdb_file: str,
output_dir: str,
charges: Optional[Dict[str, int]] = None,
charge_method: str = 'bcc',
atom_type: str = 'gaff2',
) -> Dict[str, Dict[str, str]]
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
pdb_file | str | — | PDB file containing ligands |
output_dir | str | — | Base directory (each ligand gets a subdirectory) |
charges | Dict[str, int] or None | None | Map of ligand name → net charge (default: 0 for all) |
charge_method | str | 'bcc' | Charge method for antechamber |
atom_type | str | 'gaff2' | Atom type set — 'gaff2' or 'gaff' (see ATOM_TYPES) |
Returns: Dict[str, Dict[str, str]] — maps ligand names to their file paths (same structure as parametrize_ligand())
Example 22: Parametrize All Ligands¶
"""
Builder Example 22: Parametrize all ligands in a PDB file
Demonstrates the full ligand parametrization workflow:
1. Detect ligands from PDB
2. Extract each ligand
3. Run antechamber + parmchk2 + tleap for each
4. Collect .frcmod and .lib files
NOTE: This example requires AmberTools (antechamber, parmchk2, tleap)
to be installed and accessible in the PATH.
"""
from pathlib import Path
from gatewizard.tools.ligand_parametrization import (
parametrize_all_ligands,
build_ligand_param_args,
build_tleap_ligand_lines,
)
pdb_file = "tests/2MVJ_2ligs.pdb"
output_dir = "./systems/ligand_params"
# Set charges for each ligand (default is 0 if not specified)
charges = {
'AAA': 0,
'BBB': 0,
}
print(f"Output directory: {output_dir}")
# Parametrize all ligands
results = parametrize_all_ligands(
pdb_file=pdb_file,
output_dir=output_dir,
charges=charges,
charge_method='bcc', # AM1-BCC charges
atom_type='gaff2', # GAFF2 atom types (recommended with abcg2; bcc also works)
)
print(f"\nParametrized {len(results)} ligand(s):")
for name, files in results.items():
print(f"\n {name}:")
for file_type, file_path in files.items():
exists = Path(file_path).exists()
print(f" {file_type}: {file_path} ({'exists' if exists else 'MISSING'})")
# Now these can be passed to Builder.prepare_system()
print("\n\npackmol-memgen arguments:")
pmm_args = build_ligand_param_args(results)
print(f" {' '.join(pmm_args)}")
print("\ntleap ligand lines:")
tleap_lines = build_tleap_ligand_lines(results)
print(tleap_lines)
print(f"\nAll parameter files saved in: {output_dir}")
Function: get_ligand_2d_image()¶
Generate a publication-quality 2D molecular structure image using RDKit.
get_ligand_2d_image(
ligand_pdb_or_mol2: str,
output_image: str,
width: int = 400,
height: int = 300,
*,
remove_nonpolar_h: bool = True,
remove_all_h: bool = False,
dpi: int = 150,
bond_line_width: float = 2.5,
atom_label_font_size: int = 0,
background_color: tuple = (0.11, 0.11, 0.11, 1.0),
padding: float = 0.15,
kekulize: bool = True,
wedge_bonds: bool = True,
atom_palette: dict | None = None,
highlight_atoms: list[int] | None = None,
highlight_color: tuple = (1.0, 0.8, 0.0, 0.3),
transparent_background: bool = False,
) -> Optional[str]
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
ligand_pdb_or_mol2 | str | — | Path to PDB or MOL2 file |
output_image | str | — | Path for the output PNG |
width | int | 400 | Image width in pixels (before DPI scaling) |
height | int | 300 | Image height in pixels (before DPI scaling) |
remove_nonpolar_h | bool | True | Remove non-polar hydrogens (C-H) for cleaner figures while keeping polar H (O-H, N-H, etc.) |
remove_all_h | bool | False | Remove all explicit hydrogens. Overrides remove_nonpolar_h |
dpi | int | 150 | DPI multiplier. Use 300 for print-quality images |
bond_line_width | float | 2.5 | Thickness of bond lines |
atom_label_font_size | int | 0 | Font size for atom labels (0 = auto) |
background_color | tuple | (0.11, 0.11, 0.11, 1.0) | RGBA background colour (0–1 range) |
padding | float | 0.15 | Fractional padding around the molecule (0–1) |
kekulize | bool | True | Draw aromatic bonds as alternating single/double |
wedge_bonds | bool | True | Draw stereo wedge/dash bonds |
atom_palette | dict | None | Custom {atomic_number: (r, g, b)} colour map. Uses built-in dark palette when None |
highlight_atoms | list[int] | None | 0-based atom indices to highlight |
highlight_color | tuple | (1.0, 0.8, 0.0, 0.3) | RGBA colour for highlights |
transparent_background | bool | False | Produce a transparent PNG |
Returns: Path to the generated PNG image, or None if generation failed
Built-in palettes:
| Palette | Import | Best for |
|---|---|---|
_DEFAULT_DARK_PALETTE | (used automatically) | Dark backgrounds |
LIGHT_PALETTE | from gatewizard.tools.ligand_parametrization import LIGHT_PALETTE | White / light backgrounds |
Function: get_ligand_2d_image_from_pdb_lines()¶
Generate a 2D image directly from PDB HETATM lines (no file required). All keyword arguments are forwarded to get_ligand_2d_image.
get_ligand_2d_image_from_pdb_lines(
pdb_lines: List[str],
output_image: str,
width: int = 400,
height: int = 300,
**kwargs,
) -> Optional[str]
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
pdb_lines | List[str] | — | HETATM lines for the ligand |
output_image | str | — | Path for the output PNG |
width | int | 400 | Image width |
height | int | 300 | Image height |
**kwargs | All keyword options from get_ligand_2d_image |
Returns: Path to the generated PNG image, or None if failed
Example 24: Generate 2D Structure Images¶
"""
Builder Example 24: Generate 2D structure images of ligands
Demonstrates generating 2D molecular structure images from PDB files
using RDKit, including options for hydrogen removal, DPI control,
custom colour palettes, and transparent backgrounds.
"""
from pathlib import Path
from gatewizard.tools.ligand_parametrization import (
detect_ligands,
get_ligand_2d_image_from_pdb_lines,
get_ligand_2d_image,
extract_ligand_pdb,
LIGHT_PALETTE,
)
pdb_file = "tests/2MVJ_2ligs.pdb"
output_dir = "./systems/ligand_images"
Path(output_dir).mkdir(parents=True, exist_ok=True)
# Detect ligands
ligands = detect_ligands(pdb_file)
print(f"Detected {len(ligands)} ligands\n")
for lig in ligands:
# --- Style 1: Default (dark background, non-polar H removed) --------
img = str(Path(output_dir) / f"{lig.name}_default.png")
result = get_ligand_2d_image_from_pdb_lines(
lig.pdb_lines, img, width=400, height=300,
remove_nonpolar_h=True, # cleaner look (default)
)
if result:
print(f"{lig.name} default : {Path(result).stat().st_size:>6} bytes")
# --- Style 2: All hydrogens removed ----------------------------------
img = str(Path(output_dir) / f"{lig.name}_no_h.png")
result = get_ligand_2d_image_from_pdb_lines(
lig.pdb_lines, img, width=400, height=300,
remove_all_h=True, # skeleton only
)
if result:
print(f"{lig.name} no-H : {Path(result).stat().st_size:>6} bytes")
# --- Style 3: High-DPI for publication (white background) ------------
img = str(Path(output_dir) / f"{lig.name}_hires.png")
lig_dir = str(Path(output_dir) / lig.name)
extracted_pdb = extract_ligand_pdb(pdb_file, lig.name, lig_dir)
result = get_ligand_2d_image(
extracted_pdb, img,
width=800, height=600,
dpi=300, # high DPI
remove_all_h=True,
background_color=(1, 1, 1, 1), # white
atom_palette=LIGHT_PALETTE, # colours for light background
bond_line_width=1.5,
)
if result:
print(f"{lig.name} hi-res : {Path(result).stat().st_size:>6} bytes")
# --- Style 4: Transparent background ---------------------------------
img = str(Path(output_dir) / f"{lig.name}_transparent.png")
result = get_ligand_2d_image_from_pdb_lines(
lig.pdb_lines, img, width=400, height=300,
remove_nonpolar_h=True,
transparent_background=True,
)
if result:
print(f"{lig.name} transp. : {Path(result).stat().st_size:>6} bytes")
# --- Style 5: All hydrogens visible, thicker bonds -------------------
img = str(Path(output_dir) / f"{lig.name}_all_h.png")
result = get_ligand_2d_image_from_pdb_lines(
lig.pdb_lines, img, width=500, height=400,
remove_nonpolar_h=False,
remove_all_h=False,
bond_line_width=3.5,
padding=0.2,
)
if result:
print(f"{lig.name} all-H : {Path(result).stat().st_size:>6} bytes")
print()
print(f"All images saved in: {output_dir}")
Function: build_ligand_param_args()¶
Build --ligand_param command-line arguments for packmol-memgen.
Parameters:
| Parameter | Type | Description |
|---|---|---|
ligand_files | Dict[str, Dict[str, str]] | Ligand name → file paths (as returned by parametrize_ligand) |
Returns: List of CLI arguments — each ligand produces ['--ligand_param', 'path.frcmod:path.lib']
One Flag Per Ligand
packmol-memgen requires a separate --ligand_param flag per ligand. Combining multiple ligands into a single flag will not work.
Function: build_tleap_ligand_lines()¶
Build tleap input lines to load GAFF/GAFF2 and ligand parameters. Insert these before loadPDB.
build_tleap_ligand_lines(
ligand_files: Dict[str, Dict[str, str]],
atom_type: str = 'gaff2',
) -> str
Parameters:
| Parameter | Type | Description |
|---|---|---|
ligand_files | Dict[str, Dict[str, str]] | Ligand name → file paths |
atom_type | str | Atom type set — 'gaff2' or 'gaff' (default: 'gaff2'). Determines the leaprc source line |
Returns: Multi-line string with tleap commands (source leaprc.gaff2 or source leaprc.gaff, loadamberparams, loadoff)
Example 21: Build packmol-memgen and tleap Arguments¶
"""
Builder Example 21: Build packmol-memgen and tleap commands with ligand parameters
Demonstrates how to construct the packmol-memgen --ligand_param arguments
and tleap input lines for systems with ligands.
"""
from gatewizard.tools.ligand_parametrization import (
build_ligand_param_args,
build_tleap_ligand_lines,
)
# Simulated parametrization results (paths to .frcmod and .lib files)
ligand_files = {
'AAA': {
'frcmod': 'ligand_params/AAA/AAA.frcmod',
'lib': 'ligand_params/AAA/AAA.lib',
'mol2': 'ligand_params/AAA/AAA.mol2',
},
'BBB': {
'frcmod': 'ligand_params/BBB/BBB.frcmod',
'lib': 'ligand_params/BBB/BBB.lib',
'mol2': 'ligand_params/BBB/BBB.mol2',
},
}
# Build packmol-memgen arguments
# Each ligand gets its own --ligand_param flag (CANNOT combine in one flag)
pmm_args = build_ligand_param_args(ligand_files)
print("packmol-memgen arguments:")
for i in range(0, len(pmm_args), 2):
print(f" {pmm_args[i]} {pmm_args[i+1]}")
print()
# Build tleap input lines
# These go BEFORE loadPDB in the tleap input
tleap_lines = build_tleap_ligand_lines(ligand_files)
print("tleap input lines:")
print(tleap_lines)
# With GAFF atom types instead of GAFF2
tleap_lines_gaff = build_tleap_ligand_lines(ligand_files, atom_type='gaff')
print("\ntleap input lines (GAFF):")
print(tleap_lines_gaff)
Builder Integration: ligand_params Config Key¶
When ligands are parametrized (either through the GUI or programmatically), the Builder uses the ligand_params configuration key to integrate them into the build process.
Config Structure:
config['ligand_params'] = {
'AAA': {
'frcmod': '/path/to/AAA/AAA.frcmod',
'lib': '/path/to/AAA/AAA.lib',
},
'BBB': {
'frcmod': '/path/to/BBB/BBB.frcmod',
'lib': '/path/to/BBB/BBB.lib',
},
}
What happens during build:
- packmol-memgen receives
--ligand_param frcmod:libfor each ligand, plus--gaff2(or--gaffdepending on the atom type used) - tleap parametrization loads
source leaprc.gaff2(orsource leaprc.gaff), thenloadamberparamsandloadofffor each ligand beforeloadPDB - The atom type is automatically extracted from the parametrized ligand results, so matching
leaprcis always used
Example 23: Full Membrane System with Ligand Parametrization¶
"""
Builder Example 23: Full membrane system with ligand parametrization
Demonstrates setting up a complete membrane system that includes
non-standard ligands. The ligand .frcmod/.lib files are passed
to both packmol-memgen and the final tleap parametrization.
NOTE: This example requires AmberTools and packmol-memgen.
"""
from gatewizard.core.builder import Builder
from gatewizard.tools.ligand_parametrization import (
detect_ligands,
parametrize_all_ligands,
)
# Create builder
builder = Builder()
pdb_file = "tests/2MVJ_2ligs.pdb"
working_dir = "./systems"
# Step 1: Detect and parametrize ligands
print("Step 1: Detecting ligands...")
ligands = detect_ligands(pdb_file)
for lig in ligands:
print(f" Found: {lig.name} ({lig.num_atoms} atoms, {lig.formula})")
print("\nStep 2: Parametrizing ligands...")
ligand_results = parametrize_all_ligands(
pdb_file=pdb_file,
output_dir=f"{working_dir}/ligand_params",
charges={'AAA': 0, 'BBB': 0},
charge_method='bcc',
atom_type='gaff2', # GAFF2 atom types
)
print(f" Parametrized: {list(ligand_results.keys())}")
# Step 3: Configure builder with ligand parameters
builder.set_configuration(
water_model='tip3p',
protein_ff='ff14SB',
lipid_ff='lipid21',
preoriented=True,
parametrize=True,
salt_concentration=0.15,
dist_wat=17.5,
notprotonate=True,
ligand_params=ligand_results, # Pass parametrized ligand files
)
print("\nStep 3: Builder configured with ligand parameters")
print(f" Ligands in config: {list(builder.config['ligand_params'].keys())}")
# Step 4: Prepare system
success, message, job_dir = builder.prepare_system(
pdb_file=pdb_file,
working_dir=working_dir,
upper_lipids=['POPC'],
lower_lipids=['POPC'],
lipid_ratios='1.0//1.0',
)
print(f"\nResult: {message}")
# The builder will:
# 1. Add --ligand_param AAA/AAA.frcmod:AAA/AAA.lib to packmol-memgen
# 2. Add --ligand_param BBB/BBB.frcmod:BBB/BBB.lib to packmol-memgen
# 3. Add --gaff2 to packmol-memgen
# 4. Load ligand .frcmod and .lib in the tleap parametrization step
print("\nWorkflow complete. The builder will pass ligand params to:")
print(" - packmol-memgen (--ligand_param flags)")
print(" - tleap (loadamberparams/loadoff commands)")
Example 26: Atom Type Selection and Recommended Pairings¶
"""
Builder Example 26: Atom type selection and recommended pairings
Demonstrates how to choose between GAFF and GAFF2 atom types,
check recommended pairings per the AMBER manual, and use
the ABCG2 charge method with GAFF2.
NOTE: This example requires AmberTools (antechamber, parmchk2, tleap)
to be installed and accessible in the PATH.
"""
from gatewizard.tools.ligand_parametrization import (
parametrize_all_ligands,
build_tleap_ligand_lines,
ATOM_TYPES,
CHARGE_METHODS,
DEFAULT_ATOM_TYPE,
DEFAULT_CHARGE_METHOD,
RECOMMENDED_COMBOS,
NON_RECOMMENDED_COMBOS,
)
# ── Available options ────────────────────────────────────────────────
print("Available atom types:")
for key, label in ATOM_TYPES.items():
default = " (default)" if key == DEFAULT_ATOM_TYPE else ""
print(f" {key}: {label}{default}")
print("\nAvailable charge methods:")
for key, label in CHARGE_METHODS.items():
default = " (default)" if key == DEFAULT_CHARGE_METHOD else ""
print(f" {key}: {label}{default}")
# ── Recommended pairings ─────────────────────────────────────────────
print("\nRecommended pairings (AMBER manual):")
for at, cm in sorted(RECOMMENDED_COMBOS):
print(f" {at} + {cm} ✓")
print("\nNon-recommended pairings (will show warning):")
for at, cm in sorted(NON_RECOMMENDED_COMBOS):
print(f" {at} + {cm} ✗")
# ── Check a pairing before parametrizing ─────────────────────────────
atom_type = 'gaff2'
charge_method = 'abcg2'
if (atom_type, charge_method) in RECOMMENDED_COMBOS:
print(f"\n{atom_type}/{charge_method} is a recommended pairing.")
elif (atom_type, charge_method) in NON_RECOMMENDED_COMBOS:
print(f"\nWARNING: {atom_type}/{charge_method} is NOT recommended.")
else:
print(f"\n{atom_type}/{charge_method} has no specific recommendation.")
# ── Parametrize with gaff2/abcg2 (recommended) ──────────────────────
pdb_file = "tests/2MVJ_2ligs.pdb"
output_dir = "./systems/ligand_params_gaff2_abcg2"
results = parametrize_all_ligands(
pdb_file=pdb_file,
output_dir=output_dir,
charges={'AAA': 0, 'BBB': 0},
charge_method='abcg2', # ABCG2 charges
atom_type='gaff2', # GAFF2 atom types (recommended with abcg2)
)
print(f"\nParametrized {len(results)} ligand(s) with gaff2/abcg2:")
for name, files in results.items():
print(f" {name}: {files.get('frcmod', 'N/A')}")
# ── tleap lines reflect the chosen atom type ─────────────────────────
tleap_gaff2 = build_tleap_ligand_lines(results, atom_type='gaff2')
print(f"\ntleap lines (GAFF2):\n{tleap_gaff2}")
tleap_gaff = build_tleap_ligand_lines(results, atom_type='gaff')
print(f"\ntleap lines (GAFF):\n{tleap_gaff}")
Ligand Parametrization Troubleshooting¶
"Antechamber failed":
- Check that AMBER/AmberTools is installed and in
$PATH - Verify ligand PDB has proper HETATM records
- Try a different charge method (e.g.,
gasis faster and may work whenbccfails) - Check
logs/antechamber.logfor detailed error messages
"No ligands detected":
- Ligands must use HETATM records, not ATOM
- Residue names must NOT be in
STANDARD_RESIDUESset - Water (HOH, WAT), ions (NA, CL), and lipids (POPC, etc.) are excluded by design
"tleap: unknown residue name":
- The tleap variable name must match the 3-letter residue name exactly
- Ensure
.libfile was generated with the correct residue name insaveoff - Check that
source leaprc.gaff2(orsource leaprc.gaff) is loaded beforeloadamberparams
"Non-recommended combination" warning (gaff2/bcc or gaff/abcg2):
- The AMBER manual recommends
gaff/bccandgaff2/abcg2pairings - Using
gaff2/bccorgaff/abcg2may produce suboptimal parameters - The GUI shows an amber warning label and a confirmation dialog
- To silence the warning, switch to a recommended pairing
- Check
RECOMMENDED_COMBOSandNON_RECOMMENDED_COMBOSconstants for the full list
"resp/esp: Gaussian not found":
- The
respandespcharge methods require Gaussian (external QM software) - Gaussian is not included in AmberTools
- Use
bccorabcg2instead (both use the built-insqmengine)
"RDKit not available":
- Install with
conda install -c conda-forge rdkit - RDKit is a required dependency for ligand 2D visualization
See Also¶
- Preparation Module - Protonation state analysis
- Equilibration Module - MD equilibration protocols
- User Guide - Builder Tab - GUI workflow
- Examples on GitHub - Complete workflow examples