Skip to content

VOSpace

🎯 VOSpace Guide Overview

Master CANFAR's long-term storage system:

  • VOSpace Concepts: Understanding IVOA standards and when to use Vault
  • Web Interface: Browser-based file management and sharing
  • Command-Line Tools: Efficient bulk operations and automation
  • Python API: Programmatic access for workflows and integration
  • Metadata & Sharing: Rich data descriptions and collaborative access

VOSpace is CANFAR's implementation of the International Virtual Observatory Alliance (IVOA) VOSpace standard, providing long-term, secure, and collaborative storage for astronomy data. It serves as both an archive and a data sharing platform.

🌐 VOSpace Overview

What is VOSpace?

VOSpace is a distributed storage service that allows astronomers to:

  • Store data persistently with geographic redundancy
  • Share data with collaborators and the public
  • Organize data with hierarchical directories and metadata
  • Access data programmatically via standardized APIs
  • Integrate with Virtual Observatory tools and services

Vault VOSpace vs ARC VOSpace vs Scratch

Feature Vault ARC Projects ARC Home Scratch
Persistence ✅ Permanent ✅ Permanent ✅ Permanent ❌ Session only
Backup ✅ Geo-redundant ⚠️ Basic ⚠️ Basic ❌ None
Sharing ✅ Flexible permissions ⚠️ Group-based ⚠️ User-based ❌ Session only
Public access ✅ Public URLs ❌ Private ❌ Private ❌ Session only
Metadata ✅ Rich metadata ⚠️ Basic ⚠️ Basic ❌ None
API access ✅ VOSpace API ✅ VOSpace API ✅ VOSpace API ❌ None
Speed Slow (network) Medium (network) Medium (network) Fast (SSD)

🌍 Web Interface

Accessing VOSpace

  1. Navigate to:
  2. Login: Use your CADC credentials
  3. Browse: Navigate through your space and shared spaces

Web Interface Features

File Operations

  • Upload: Drag and drop or click "Add" → "Upload Files"
  • Download: Select files → "Download" (ZIP, URL list, or HTML list)
  • Create folders: "Add" → "Create Folder"
  • Delete: Select items → "Delete"
  • Move/Copy: Drag and drop or cut/paste

Sharing and Permissions

Right-click file/folder → Properties → Permissions

Permission Types:
- Read (r): View and download
- Write (w): Modify and delete  
- Execute (x): Navigate directories

Target Groups:
- Owner: You (full control)
- Group: Project members
- Other: Public access

💻 Command Line Interface

Installation

VOSpace tools are pre-installed in CANFAR sessions in CANFAR-maintained containers such as astroml. For local or custom installation, use pip:

# Install VOS python module with vcp/vsync/vls/vchmod/vmkdir commands
pip install vos

# Verify installation
vls --help
vcp --help

Authentication

# Get security certificate (valid 24 hours)
cadc-get-cert -u [user]

# Verify authentication
vls vos:[user]

Basic Operations

Directory Operations

# List directories and files
vls vos:[user]/                    # Your root directory
vls vos:[user]/projects/           # Subdirectory
vls -l vos:[user]/data/            # Detailed listing

# Create directories
vmkdir vos:[user]/new_project/
vmkdir vos:[user]/data/2024/

# Navigate hierarchically
vls vos:[user]/projects/survey/data/

File Operations

# Upload files
vcp mydata.fits vos:[user]/data/
vcp *.fits vos:[user]/observations/
# vcp is recursive
vcp ./analysis_scripts/ vos:[user]/code/ 

# Download files  
vcp vos:[user]/data/results.fits ./
vcp "vos:[user]/observations/*.fits" ./data/
vcp vos:[user]/code/ ./local_scripts/

# Copy between VOSpace locations
vcp vos:[user]/data/obs1.fits vos:[user]/backup/

File Management

# Move/rename files
vmv vos:[user]/old_name.fits vos:[user]/new_name.fits
vmv vos:[user]/temp/ vos:[user]/archive/

# Delete files and directories
vrm vos:[user]/old_file.fits
vrm vos:[user]/old_directory/

# View file contents (for text files)
vcat vos:[user]/catalog.csv

Advanced Operations

Bulk Operations

# Synchronise directories
vsync ./local_data/ vos:[user]/backup/
vsync vos:[user]/analysis/ ./local_analysis/

# Parallel transfers for speed
vsync --nstreams=4 huge_dataset.tar vos:[user]/archives/

Permission Management

# Make file publicly readable
vchmod o+r vos:[user]/public_catalog.fits

# Grant group access
vchmod g+rw vos:[user]/shared_data.fits

# Set permissions for specific groups
vchmod g+r:external-collaborators vos:[user]/collaboration_data/

# View current permissions
vls -l vos:[user]/myfile.fits

Data Cutouts and Processing

# FITS cutouts (pixel coordinates)
vcp "vos:[user]/image.fits[100:200,100:200]" ./cutout.fits

# Header-only download
vcp --head vos:[user]/large_image.fits ./headers.txt

# Inspect headers without downloading
vcat --head vos:[user]/observation.fits

🐍 Python API

Basic Setup

import vos
from vos import Client

# Initialize client (uses existing authentication)
client = Client()

# Alternative: specify authentication
client = Client(username='[user]', password='[password]')

File Operations

# List directory contents
files = client.listdir('vos:[user]/')
print(f"Found {len(files)} files")

# Check if file exists
exists = client.isfile('vos:[user]/data.fits')
if not exists:
    print("File not found")

# Get file information
info = client.get_info('vos:[user]/data.fits')
print(f"Size: {info['size']} bytes")
print(f"Modified: {info['date']}")

# Copy files
client.copy('mydata.fits', 'vos:[user]/uploads/mydata.fits')
client.copy('vos:[user]/results.txt', './local_results.txt')

# Create directories
client.mkdir('vos:[user]/new_project/')

# Delete files
client.delete('vos:[user]/old_file.fits')

Advanced Python Usage

Batch Processing

import os
from pathlib import Path

def process_vospace_directory(vospace_path, local_temp_dir):
    """Download, process, and re-upload files from VOSpace"""

    # Create local working directory
    Path(local_temp_dir).mkdir(exist_ok=True)

    # List files in VOSpace
    files = client.listdir(vospace_path)
    fits_files = [f for f in files if f.endswith('.fits')]

    for fits_file in fits_files:
        vospace_file = f"{vospace_path}/{fits_file}"
        local_file = f"{local_temp_dir}/{fits_file}"
        processed_file = f"{local_temp_dir}/processed_{fits_file}"

        # Download
        print(f"Downloading {fits_file}")
        client.copy(vospace_file, local_file)

        # Process (example: your analysis here)
        process_fits_file(local_file, processed_file)

        # Upload processed version
        processed_vospace = f"{vospace_path}/processed_{fits_file}"
        client.copy(processed_file, processed_vospace)

        # Cleanup local files
        os.remove(local_file)
        os.remove(processed_file)

# Usage
process_vospace_directory('vos:[user]/raw_data', './temp_processing')

Metadata Management

# Get file node (for metadata operations)
node = client.get_node('vos:[user]/observation.fits')

# Set metadata
node.props['TELESCOPE'] = 'ALMA'
node.props['OBJECT'] = 'NGC1365'
node.props['DATE-OBS'] = '2024-03-15T10:30:00'

# Update node with new metadata
client.update(node)

# Read metadata
props = node.props
telescope = props.get('TELESCOPE', 'Unknown')
object_name = props.get('OBJECT', 'Unknown')

print(f"Observation of {object_name} with {telescope}")

Progress Monitoring

def upload_with_progress(local_file, vospace_path):
    """Upload file with progress monitoring"""

    file_size = os.path.getsize(local_file)

    def progress_callback(bytes_transferred):
        percent = (bytes_transferred / file_size) * 100
        print(f"\rProgress: {percent:.1f}% ({bytes_transferred}/{file_size} bytes)", end='')

    try:
        client.copy(local_file, vospace_path, callback=progress_callback)
        print("\nUpload completed successfully!")
    except Exception as e:
        print(f"\nUpload failed: {e}")

# Usage
upload_with_progress('large_dataset.fits', 'vos:[user]/archives/dataset.fits')

🔒 Sharing and Collaboration

Permission Levels

Owner Permissions

  • Full control: Read, write, delete, change permissions
  • Default: Only owner has access to new files

Group Permissions

  • Read: Group members can view and download
  • Write: Group members can modify and upload
  • Execute: Group members can navigate directories

Public Permissions

  • Read: Anyone with the URL can download
  • Useful for: Publishing datasets, sharing with external collaborators

Setting Up Sharing

Command Line Sharing

# Make dataset publicly available
vchmod o+r vos:[user]/public_datasets/gaia_subset.fits

# Share with research group
vchmod g+rw:my_research_group vos:[user]/shared_analysis/

# Create public directory
vmkdir vos:[user]/public/
vchmod o+r vos:[user]/public/

# Share specific project data
vchmod g+r:external_collaborators vos:[user]/collaboration/survey_data/

Public URLs

# Files with public read permissions get accessible URLs:
# https://ws-cadc.canfar.net/vault/nodes/[user]/public_file.fits

# Direct download links for shared data:
curl -O https://ws-cadc.canfar.net/vault/nodes/[user]/public/catalog.csv

Collaboration Workflows

Multi-Institutional Project

# Project coordinator sets up shared space
vmkdir vos:[project]/data
vmkdir vos:[project]/public
vmkdir vos:[project]/results
vchmod g+rw:all_collaborators vos:[project]/data/

# Collaborators contribute data
vcp local_observations.fits vos:[project]/data/institution_a/
vcp analysis_results.csv vos:[project]/results/

# Public data release
vcp vos:[project]/data/final_catalogue.fits vos:[project]/public/
vchmod o+r vos:[project]/public/final_catalogue.fits

Data Publication

import vos

def publish_dataset(local_files, publication_space):
    """Publish dataset with proper metadata"""

    client = vos.Client()

    # Create publication directory
    client.mkdir(publication_space)

    for local_file in local_files:
        filename = os.path.basename(local_file)
        vospace_path = f"{publication_space}/{filename}"

        # Upload file
        client.copy(local_file, vospace_path)

        # Set metadata
        node = client.get_node(vospace_path)
        node.props['AUTHOR'] = 'Dr. Astronomer'
        node.props['PUBLICATION'] = 'ApJ 2024, 123, 456'
        node.props['DOI'] = '10.1088/example'
        client.update(node)

        # Make publicly accessible
        client.set_permissions(vospace_path, public_read=True)

        print(f"Published: {vospace_path}")

# Usage
files_to_publish = ['final_catalog.fits', 'processed_images.tar.gz']
publish_dataset(files_to_publish, 'vos:[user]/publications/survey2024')

🔧 Integration with Astronomical Tools

FITS File Handling

from astropy.io import fits
import tempfile
import os

def analyze_vospace_fits(vospace_path):
    """Analyze FITS file stored in VOSpace"""

    # Download to temporary file
    with tempfile.NamedTemporaryFile(suffix='.fits', delete=False) as tmp:
        client.copy(vospace_path, tmp.name)

        # Open with astropy
        with fits.open(tmp.name) as hdul:
            header = hdul[0].header
            data = hdul[0].data

            # Perform analysis
            mean_value = data.mean()
            max_value = data.max()

            print(f"Image stats: mean={mean_value:.2f}, max={max_value:.2f}")

            # Extract key information
            telescope = header.get('TELESCOP', 'Unknown')
            object_name = header.get('OBJECT', 'Unknown')

        # Cleanup
        os.unlink(tmp.name)

    return {'mean': mean_value, 'max': max_value, 'telescope': telescope}

# Usage
stats = analyze_vospace_fits('vos:[user]/observations/ngc1365.fits')

Integration with Archives

def mirror_archive_data(archive_url, vospace_destination):
    """Download from astronomical archive and store in VOSpace"""

    import requests
    import tempfile

    # Download from archive
    response = requests.get(archive_url)

    # Save to temporary file
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(response.content)
        tmp_path = tmp.name

    try:
        # Upload to VOSpace
        client.copy(tmp_path, vospace_destination)

        # Set metadata about source
        node = client.get_node(vospace_destination)
        node.props['ARCHIVE_URL'] = archive_url
        node.props['DOWNLOAD_DATE'] = datetime.now().isoformat()
        client.update(node)

        print(f"Mirrored {archive_url} to {vospace_destination}")

    finally:
        os.unlink(tmp_path)

# Example: Mirror HST data
mirror_archive_data(
    'https://archive.stsci.edu/missions/hubble/...',
    'vos:[user]/hst_data/observation_123.fits'
)

📊 Performance and Optimization

Transfer Performance

Caching and Local Mirrors

import hashlib
from pathlib import Path

class VOSpaceCache:
    def __init__(self, cache_dir='./vospace_cache'):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(exist_ok=True)
        self.client = vos.Client()

    def get_cached_file(self, vospace_path, force_refresh=False):
        """Get file from cache or download if needed"""

        # Generate cache filename
        cache_name = hashlib.md5(vospace_path.encode()).hexdigest()
        cache_file = self.cache_dir / cache_name

        # Check if cache is valid
        if not force_refresh and cache_file.exists():
            # Compare modification times
            local_mtime = cache_file.stat().st_mtime
            try:
                remote_info = self.client.get_info(vospace_path)
                remote_mtime = remote_info['date']

                if local_mtime >= remote_mtime:
                    print(f"Using cached version: {cache_file}")
                    return str(cache_file)
            except:
                pass

        # Download fresh copy
        print(f"Downloading {vospace_path} to cache")
        self.client.copy(vospace_path, str(cache_file))
        return str(cache_file)

# Usage
cache = VOSpaceCache()
local_file = cache.get_cached_file('vos:[user]/large_catalog.fits')

Monitoring and Logging

import logging
import time

# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def monitored_transfer(source, destination):
    """Transfer with monitoring and timing"""

    start_time = time.time()
    logger.info(f"Starting transfer: {source}{destination}")

    try:
        client.copy(source, destination)

        end_time = time.time()
        duration = end_time - start_time

        # Get file size for speed calculation
        if source.startswith('vos:'):
            info = client.get_info(source)
            size_mb = info['size'] / (1024 * 1024)
        else:
            size_mb = os.path.getsize(source) / (1024 * 1024)

        speed = size_mb / duration if duration > 0 else 0

        logger.info(f"Transfer completed: {size_mb:.1f} MB in {duration:.1f}s ({speed:.1f} MB/s)")
        return True

    except Exception as e:
        logger.error(f"Transfer failed: {e}")
        return False

# Usage
success = monitored_transfer('large_file.fits', 'vos:[user]/archives/')

🛠️ Troubleshooting

Common Issues

Authentication Problems

# Certificate expired
cadc-get-cert -u [user]

# Check certificate validity
cadc-get-cert --days-valid

# Clear certificate cache
rm ~/.ssl/cadcproxy.pem
cadc-get-cert -u [user]

Permission Errors

# Check file permissions
vls -l vos:[user]/file.fits

# Verify directory permissions
vls -l vos:[user]/

# Check group membership
# (Contact CANFAR support if needed)

Network and Transfer Issues

# Test connectivity
ping ws-cadc.canfar.net

# Check VOSpace service status
vls vos:

# Retry with different parameters
vcp --timeout=3600 large_file.fits vos:[user]/  # Increase timeout
vcp --nstreams=1 problematic_file.fits vos:[user]/  # Reduce streams

Debugging and Diagnostics

import vos
import logging

# Enable debug logging
logging.basicConfig(level=logging.DEBUG)

# Get detailed client information
client = vos.Client()
print(f"VOSpace endpoint: {client.vospace_url}")
print(f"Authentication: {client.get_auth()}")

# Test basic operations
try:
    files = client.listdir('vos:')
    print(f"Root access successful, found {len(files)} items")
except Exception as e:
    print(f"Root access failed: {e}")

# Check specific paths
test_paths = ['vos:[user]/', 'vos:[user]/data/']
for path in test_paths:
    try:
        contents = client.listdir(path)
        print(f"✓ {path}: {len(contents)} items")
    except Exception as e:
        print(f"✗ {path}: {e}")

🔗 Next Steps

  • Data Transfers → - Moving data between storage systems
  • Filesystem Access → - ARC storage and SSHFS mounting
  • Storage Overview → - Understanding all CANFAR storage types
  • Interactive Sessions → - Using VOSpace within CANFAR sessions
    #### ARC (Inside CANFAR session)
    
    ```bash
    # List files and directories
    ls /arc/projects/[project]/
    
    # Copy files
    cp mydata.fits /arc/projects/[project]/data/
    
    # Create directories
    mkdir /arc/projects/[project]/survey_analysis/
    
    # Move/rename files
    mv /arc/projects/[project]/old.fits /arc/projects/[project]/new.fits
    
    # Remove files
    rm /arc/projects/[project]/temp/old_data.fits
    

Bulk Operations

Note: vsync and vcp are always recursive; no --recursive flag is needed.

Vault (VOSpace API)

# Sync entire directories to Vault
vsync ./local_data/ vos:[user]/backup/

# Download project data from Vault
vsync vos:[project]/survey_data/ ./project_data/

# Upload analysis results to Vault
vsync ./results/ vos:[user]/analysis_outputs/

ARC (VOSpace API, outside CANFAR)

# Sync entire directories to ARC
vsync ./local_data/ arc:projects/[project]/backup/

# Download project data from ARC
vsync arc:projects/[project]/survey_data/ ./project_data/

# Upload analysis results to ARC
vsync ./results/ arc:projects/[project]/analysis_outputs/

Python API

Basic Usage

import vos

# Initialize client
client = vos.Client()


# List directory contents in Vault
files_vault = client.listdir("vos:[user]/")
print(files_vault)

# List directory contents in ARC
files_arc = client.listdir("arc:projects/[project]/")
print(files_arc)

# Check if file exists in Vault
exists_vault = client.isfile("vos:[user]/data.fits")

# Check if file exists in ARC
exists_arc = client.isfile("arc:projects/[project]/data.fits")

# Get file info from Vault
info_vault = client.get_info("vos:[user]/data.fits")
print(f"Size: {info_vault['size']} bytes")
print(f"Modified: {info_vault['date']}")

# Get file info from ARC
info_arc = client.get_info("arc:projects/[project]/data.fits")
print(f"Size: {info_arc['size']} bytes")
print(f"Modified: {info_arc['date']}")

File Operations

# Copy file to Vault
client.copy("mydata.fits", "vos:[user]/data/mydata.fits")

# Copy file to ARC
client.copy("mydata.fits", "arc:projects/[project]/data/mydata.fits")

# Copy file from Vault
client.copy("vos:[user]/data/results.txt", "./results.txt")

# Copy file from ARC
client.copy("arc:projects/[project]/data/results.txt", "./results.txt")

# Create directory in Vault
client.mkdir("vos:[user]/new_project/")

# Create directory in ARC
client.mkdir("arc:projects/[project]/new_project/")

# Delete file in Vault
client.delete("vos:[user]/temp/old_file.txt")

# Delete file in ARC
client.delete("arc:projects/[project]/temp/old_file.txt")

Advanced Operations

import os
from astropy.io import fits

def process_fits_files(vospace_dir, output_dir):
    """Process all FITS files in a Vault or ARC directory"""

    # List all FITS files
    files = client.listdir(vospace_dir)
    fits_files = [f for f in files if f.endswith(".fits")]

    for fits_file in fits_files:
        vospace_path = f"{vospace_dir}/{fits_file}"
        local_path = f"./temp_{fits_file}"

        # Download file
        client.copy(vospace_path, local_path)

        # Process with astropy
        with fits.open(local_path) as hdul:
            # Your processing here
            processed_data = hdul[0].data * 2  # Example processing

            # Save processed file
            output_path = f"{output_dir}/processed_{fits_file}"
            fits.writeto(output_path, processed_data, overwrite=True)


            # Upload to Vault or ARC
            if vospace_dir.startswith("vos:"):
                client.copy(output_path, f"vos:[user]/processed/{fits_file}")
            else:
                client.copy(output_path, f"arc:projects/[project]/processed/{fits_file}")

        # Clean up temporary file
        os.remove(local_path)

# Usage

process_fits_files("vos:[user]/raw_data", "./processed/")
process_fits_files("arc:projects/[project]/raw_data", "./processed/")

Automation Workflows

Batch Processing Script

#!/usr/bin/env python3
"""
Automated data processing pipeline using Vault (VOSpace API) and ARC
"""
import vos
import sys
import logging
from pathlib import Path

# Setup logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def setup_vospace():
    """Initialize VOSpace client with authentication"""
    try:
        client = vos.Client()
        # Test connection
        client.listdir("vos:[project]/")
        return client
    except Exception as e:
        logger.error(f"VOSpace authentication failed: {e}")
        sys.exit(1)

def sync_input_data(client, remote_dir, local_dir):
    """Download input data from Vault or ARC"""
    logger.info(f"Syncing {remote_dir} to {local_dir}")

    Path(local_dir).mkdir(parents=True, exist_ok=True)

    # Get list of files
    files = client.listdir(remote_dir)

    for file in files:
        if file.endswith((".fits", ".txt", ".csv")):
            remote_path = f"{remote_dir}/{file}"
            local_path = f"{local_dir}/{file}"

            if not Path(local_path).exists():
                logger.info(f"Downloading {file}")
                client.copy(remote_path, local_path)

def upload_results(client, local_dir, remote_dir):
    """Upload processing results to Vault or ARC"""
    logger.info(f"Uploading results from {local_dir} to {remote_dir}")

    # Ensure remote directory exists
    try:
        client.mkdir(remote_dir)
    except:
        pass  # Directory might already exist

    for file_path in Path(local_dir).glob("*"):
        if file_path.is_file():
            remote_path = f"{remote_dir}/{file_path.name}"
            logger.info(f"Uploading {file_path.name}")
            client.copy(str(file_path), remote_path)

def main():
    """Main processing pipeline"""
    client = setup_vospace()

    # Configuration
    input_remote_vault = "vos:[project]/raw_data"
    input_remote_arc = "arc:projects/[project]/raw_data"
    output_remote_vault = "vos:[user]/processed_results"
    output_remote_arc = "arc:projects/[project]/processed_results"
    local_input = "./input_data"
    local_output = "./output_data"

    # Download input data from Vault
    sync_input_data(client, input_remote_vault, local_input)
    # Download input data from ARC
    sync_input_data(client, input_remote_arc, local_input)

    # Your processing code here
    logger.info("Processing data...")
    # ... processing logic ...

    # Upload results to Vault
    upload_results(client, local_output, output_remote_vault)
    # Upload results to ARC
    upload_results(client, local_output, output_remote_arc)

    logger.info("Pipeline completed successfully")

if __name__ == "__main__":
    main()

Monitoring and Logging

Transfer Progress

def copy_with_progress(client, source, destination):
    """Copy file with progress monitoring"""
    import time

    # Start transfer
    start_time = time.time()
    client.copy(source, destination)
    end_time = time.time()

    # Get file size for speed calculation
    if source.startswith("vos:"):
        info = client.get_info(source)
        size_mb = info["size"] / (1024 * 1024)
    else:
        size_mb = os.path.getsize(source) / (1024 * 1024)

    duration = end_time - start_time
    speed = size_mb / duration if duration > 0 else 0

    print(f"Transfer completed: {size_mb:.1f} MB in {duration:.1f}s ({speed:.1f} MB/s)")

Error Handling

def robust_copy(client, source, destination, max_retries=3):
    """Copy with retry logic"""
    import time

    for attempt in range(max_retries):
        try:
            client.copy(source, destination)
            return True
        except Exception as e:
            logger.warning(f"Copy attempt {attempt + 1} failed: {e}")
            if attempt < max_retries - 1:
                time.sleep(2**attempt)  # Exponential backoff
            else:
                logger.error(f"Copy failed after {max_retries} attempts")
                return False

Performance Optimization

Parallel Transfers

import concurrent.futures
import threading


def parallel_upload(client, file_list, remote_dir, max_workers=4):
    """Upload multiple files in parallel"""

    def upload_file(file_path):
        remote_path = f"{remote_dir}/{file_path.name}"
        try:
            client.copy(str(file_path), remote_path)
            return f"✓ {file_path.name}"
        except Exception as e:
            return f"✗ {file_path.name}: {e}"

    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(upload_file, f) for f in file_list]

        for future in concurrent.futures.as_completed(futures):
            result = future.result()
            print(result)

Caching Strategy

import hashlib
from pathlib import Path


def cached_download(client, vospace_path, local_path, force_refresh=False):
    """Download file only if it has changed"""

    local_file = Path(local_path)
    cache_file = Path(f"{local_path}.cache_info")

    # Get remote file info
    remote_info = client.get_info(vospace_path)
    remote_hash = remote_info.get("MD5", "")

    # Check if we have cached info
    if not force_refresh and local_file.exists() and cache_file.exists():
        cached_hash = cache_file.read_text().strip()
        if cached_hash == remote_hash:
            print(f"Using cached version of {local_file.name}")
            return local_path

    # Download file
    print(f"Downloading {local_file.name}")
    client.copy(vospace_path, local_path)

    # Save cache info
    cache_file.write_text(remote_hash)

    return local_path

Integration Examples

With Astropy

from astropy.io import fits
from astropy.table import Table


def analyze_vospace_catalog(client, catalog_path):
    """Analyze a catalog stored in VOSpace"""

    # Download catalog
    local_path = "./temp_catalog.fits"
    client.copy(catalog_path, local_path)

    # Load and analyze
    table = Table.read(local_path)

    # Example analysis
    bright_sources = table[table["magnitude"] < 15]
    print(f"Found {len(bright_sources)} bright sources")

    # Save filtered results
    result_path = "./bright_sources.fits"
    bright_sources.write(result_path, overwrite=True)

    # Upload results
    result_vospace = catalog_path.replace(".fits", "_bright.fits")
    client.copy(result_path, result_vospace)

    # Cleanup
    os.remove(local_path)
    os.remove(result_path)

With Batch Jobs

#!/bin/bash
# Batch job script using Vault and ARC via VOSpace API

# Authenticate
cadc-get-cert --cert ~/.ssl/cadcproxy.pem


# Download input data from Vault
vcp vos:[project]/input/data.fits ./input.fits
# Download input data from ARC
vcp arc:projects/[project]/input/data.fits ./input_arc.fits

# Process data
python analysis_script.py input.fits output.fits

# Upload results to Vault
vcp output.fits vos:[project]/results/processed_$(date +%Y%m%d).fits
# Upload results to ARC
vcp output.fits arc:projects/[project]/results/processed_$(date +%Y%m%d).fits

# Cleanup
rm input.fits input_arc.fits output.fits

Troubleshooting

Common Issues

Authentication Problems:

# Refresh certificate
cadc-get-cert --cert ~/.ssl/cadcproxy.pem

# Check certificate validity
cadc-get-cert --cert ~/.ssl/cadcproxy.pem --days-valid

Network Timeouts:

# Increase timeout for large files
import vos

client = vos.Client()
client.timeout = 300  # 5 minutes

Permission Errors:

# Check file permissions in Vault
vls -l vos:[user]/file.fits
# Check file permissions in ARC
vls -l arc:home/[user]/script.py

# Check directory access in Vault
vls vos:[project]/
# Check directory access in ARC
vls arc:projects/[project]/