Cost Analysis

The Hidden Cost of Genomics File Transfer

A single whole genome sequence can generate up to 300 GB of raw data. At that size, a cohort study with 100 samples produces 30 TB. At these scales, file transfer becomes a significant line item that most labs underestimate.

The Data Scale Problem in Genomics

Modern sequencing generates massive amounts of data. A single whole genome sequence (WGS) at 30x coverage produces approximately 100-300 GB of raw data. Whole exome sequencing (WES) is more manageable at 10-20 GB per sample, but most studies involve hundreds or thousands of samples.

When genomics data needs to move between sequencing facilities, cloud compute environments, collaborating institutions, or CRO partners, the transfer costs add up quickly. Most budgets account for sequencing costs but overlook the expense of moving data afterward.

Real-World Cost Scenarios

The following table compares estimated costs for common genomics transfer scenarios across different methods:

Scenario | Size | Cloud Egress | Per-GB Service | Handrive
Single WGS Sample | 300 GB | $27 | $75 | $0
Cohort Study (100 samples) | 30 TB | $2,700 | $7,500 | $0
Longitudinal Research (1,000 samples) | 300 TB | $21,000 | $75,000 | $0
Large Biobank (10,000 samples) | 3 PB | $168,000 | $750,000 | $0

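Cloud egress is estimated from AWS S3 standard rates: about $0.09/GB for the 300 GB and 30 TB scenarios, with lower blended rates for the larger ones (roughly $0.07/GB at 300 TB and $0.056/GB at 3 PB). Sizes are converted at 1 TB = 1,000 GB. Per-GB service assumes $0.25/GB download pricing.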

Breaking Down the Cost Components

Cloud Storage Egress

If your sequencing data lives in AWS, GCP, or Azure, you pay egress fees every time data leaves. AWS S3 charges $0.09/GB for the first 10 TB, with tiered discounts for higher volumes. For a 30 TB cohort study, expect to pay $2,500-3,000 per transfer.
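To make the tiered pricing concrete, here is a minimal Python sketch of how a tiered egress bill can be computed. Only the first tier ($0.09/GB for the first 10 TB) comes from the text above; the remaining tiers in `ASSUMED_TIERS` and the function name `tiered_egress_cost` are illustrative assumptions, so check your provider's current price list before relying on the output.

```python
# Illustrative tier schedule: (tier size in GB, price per GB in USD).
# Only the first tier is taken from the article; the others are assumptions.
ASSUMED_TIERS = [
    (10_000, 0.09),        # first 10 TB
    (40_000, 0.085),       # next 40 TB (assumed)
    (100_000, 0.07),       # next 100 TB (assumed)
    (float("inf"), 0.05),  # everything beyond 150 TB (assumed)
]


def tiered_egress_cost(size_gb, tiers=ASSUMED_TIERS):
    """Return the egress cost in USD for size_gb under a tiered schedule."""
    remaining = size_gb
    total = 0.0
    for tier_size, price in tiers:
        if remaining <= 0:
            break
        billed = min(remaining, tier_size)  # GB charged at this tier's price
        total += billed * price
        remaining -= billed
    return total


print(round(tiered_egress_cost(300), 2))     # single WGS sample
print(round(tiered_egress_cost(30_000), 2))  # 100-sample cohort (30 TB)
```

Under these assumed tiers, the 30 TB cohort comes to roughly $2,600, inside the $2,500-3,000 range quoted above.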

The hidden multiplier: genomics workflows often require multiple transfers. Raw FASTQ files go to alignment, BAM files go to variant calling, VCF files go to downstream analysis. Each hop between cloud regions or providers incurs additional egress.

Enterprise File Transfer Tools

Enterprise tools like Aspera offer high-speed UDP transfer that overcomes TCP limitations on high-latency links. The trade-off is cost: annual licenses start around $10,000 and scale up based on throughput and features. For institutions with consistent high-volume needs, the fixed cost can be reasonable. For smaller labs or project-based work, the license fee is hard to justify.

Per-GB Transfer Services

Pay-per-GB services charge download fees, typically $0.20-0.30 per GB. For occasional small transfers, this is convenient. At genomics scale, it becomes prohibitive. A 30 TB dataset at $0.25/GB costs $7,500 for a single transfer.
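One way to compare the two paid options is to ask at what annual volume a fixed license costs the same as per-GB pricing. The sketch below uses the figures quoted above (a license starting around $10,000 per year and a $0.25/GB download fee); the function name `break_even_tb` and the 1 TB = 1,000 GB conversion are assumptions made for illustration.

```python
def break_even_tb(annual_license_usd, per_gb_rate_usd, gb_per_tb=1_000):
    """Annual volume in TB at which per-GB fees equal a fixed license cost."""
    return annual_license_usd / per_gb_rate_usd / gb_per_tb


# Entry-level license (~$10,000/year) versus $0.25/GB download pricing.
print(break_even_tb(10_000, 0.25))
```

With these inputs the break-even point is 40 TB per year, which is only slightly more than one transfer of a 100-sample WGS cohort.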

The Total Cost of a Typical Study

Consider a multi-site clinical genomics study with 500 WGS samples (150 TB total). The data needs to move from sequencing facility to cloud, from cloud to analysis partners, and final results back to the coordinating center.

Multi-Site Study: 500 WGS Samples (150 TB)

  • Sequencing facility → Cloud: $10,500 (cloud ingress free, facility egress varies)
  • Cloud → Analysis Partner A: $10,500
  • Cloud → Analysis Partner B: $10,500
  • Results back to coordinator: $500 (compressed results)
  • Total cloud egress: ~$32,000

Using per-GB services instead would cost approximately $112,500 for the same data movement: three 150 TB transfers at $0.25/GB, or $37,500 each.
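The study totals can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the $0.07/GB egress rate is inferred from the $10,500 figure for 150 TB, 1 TB is taken as 1,000 GB, and the helper name `transfer_cost` is hypothetical.

```python
GB_PER_TB = 1_000  # decimal conversion, assumed


def transfer_cost(size_tb, rate_per_gb):
    """Cost in USD of moving size_tb once at a flat per-GB rate."""
    return size_tb * GB_PER_TB * rate_per_gb


study_tb = 150      # 500 WGS samples
full_hops = 3       # facility -> cloud, cloud -> partner A, cloud -> partner B
results_cost = 500  # compressed results back to the coordinator (from the list above)

egress_total = full_hops * transfer_cost(study_tb, 0.07) + results_cost
per_gb_total = full_hops * transfer_cost(study_tb, 0.25)

print(f"Cloud egress total:   ${egress_total:,.0f}")
print(f"Per-GB service total: ${per_gb_total:,.0f}")
```

Changing `full_hops` shows how quickly the bill grows when a workflow adds another analysis partner or region.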

Why P2P Eliminates These Costs

Handrive uses direct peer-to-peer transfer with no intermediate servers. Data flows directly from source to destination, encrypted end-to-end. Since there is no cloud relay, there are no egress fees and no per-GB charges.

For genomics workflows, this means:

  • Sequencing facilities can deliver data directly to research institutions without paying for upload services
  • Research labs can share datasets with collaborators without egress fees
  • CRO partnerships can exchange data without either party absorbing transfer costs
  • Multi-site studies can move data between sites without budget constraints on data access

What About HIPAA?

For clinical genomics data, compliance matters. P2P architecture is HIPAA-friendly because data never resides on third-party servers. There is no Business Associate Agreement needed when no business associate handles the data. E2E encryption ensures data confidentiality during transit.

Calculator: What Are You Really Paying?

Quick Cost Estimate

Your dataset size: _____ TB

Number of transfers per year: _____

Current method: Cloud egress / Per-GB service


Cloud egress (@ $0.07/GB average): Size × 1,000 × $0.07 × transfers

Per-GB service (@ $0.25/GB): Size × 1,000 × $0.25 × transfers

Handrive: $0 regardless of volume
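The worksheet above translates directly into a small script. This is a minimal sketch: the rates are the ones quoted in the worksheet, while the names `annual_transfer_cost`, `RATES`, and `estimate` are hypothetical. The `gb_per_tb` conversion is exposed as a parameter (1,000 for decimal units, matching the scenario table; 1,024 for binary units).

```python
def annual_transfer_cost(size_tb, transfers_per_year, rate_per_gb, gb_per_tb=1_000):
    """Annual cost in USD: size x GB-per-TB x rate x number of transfers."""
    return size_tb * gb_per_tb * rate_per_gb * transfers_per_year


# Per-GB rates in USD, as quoted in the worksheet.
RATES = {"cloud_egress": 0.07, "per_gb_service": 0.25, "handrive": 0.0}


def estimate(size_tb, transfers_per_year):
    """Return the estimated annual cost per method, rounded to cents."""
    return {
        method: round(annual_transfer_cost(size_tb, transfers_per_year, rate), 2)
        for method, rate in RATES.items()
    }


# Example: a 30 TB dataset moved four times a year.
for method, cost in estimate(30, 4).items():
    print(f"{method}: ${cost:,.2f}")
```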

For a detailed breakdown of transfer costs at petabyte scale, see our petabyte transfer cost guide.

Getting Started

Handrive is free to download and use. Install it on your workstation, NAS, or Linux server. For always-on availability (important for large transfers that run overnight), set up headless mode on a dedicated machine.

The transfer uses a UDP-based protocol designed to keep the link fully utilized regardless of network latency. At sustained line rate, a 30 TB dataset transfers in approximately 7 hours on a 10 Gbps connection, or about 3 days on a 1 Gbps connection.
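The transfer-time figures follow from dividing the dataset size in bits by the link speed. The sketch below shows the arithmetic; the function name `transfer_hours` and the `efficiency` parameter (the fraction of nominal link speed actually sustained) are assumptions added for illustration, with 1 TB taken as 10^12 bytes.

```python
def transfer_hours(size_tb, link_gbps, efficiency=1.0):
    """Hours needed to move size_tb over a link of link_gbps.

    efficiency is the assumed fraction of nominal bandwidth sustained (0-1].
    """
    bits = size_tb * 1e12 * 8                         # TB -> bytes -> bits
    seconds = bits / (link_gbps * 1e9 * efficiency)   # bits / (bits per second)
    return seconds / 3600


print(f"10 Gbps: {transfer_hours(30, 10):.1f} hours")
print(f" 1 Gbps: {transfer_hours(30, 1) / 24:.1f} days")
```

Lowering `efficiency` (for example to 0.8) gives a more conservative estimate for links shared with other traffic.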



Stop Paying Per-GB for Genomic Data

Download Handrive and transfer sequencing data at no cost. E2E encrypted. No file size limits. No cloud relay.
