1.6 Working with Data

Now that you’re connected, you’ll want to get your data in and understand where things are stored.

Storage Types

Root Volume (EBS)

Every workspace includes a root volume for the operating system and your working data.

Aspect         Details
Type           Amazon EBS (Elastic Block Store) SSD
Default Size   50 GB
Maximum Size   500 GB (configurable at creation)
Persistence    Data persists when the workspace is stopped
Deletion       Data is deleted when the workspace is terminated

Studies (S3-backed Storage)

Studies provide persistent data storage that lives independently of your workspaces. Your data survives workspace termination.

Studies come in three types:

Type       Access                           Use Case
Personal   Only you                         Your own datasets and working files
Shared     Multiple users in your project   Collaborative datasets
Other      Varies by configuration          Externally managed or cross-project data

Access to a study is granted at one of two permission levels:

Permission   What You Can Do
Read-only    View and download data
Read-write   View, download, upload, and modify data
Note

Study access and configuration may vary based on your project setup. Contact your administrator for details on available studies.
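Many deployments expose studies as mounted directories on the workspace — that is an assumption here, and the mount path varies by setup — in which case you can spot them from the shell:

```shell
# Look for study mounts among the mounted filesystems
# (the 'stud' naming pattern is an assumption; ask your administrator for the real path)
mount | grep -i 'stud' || echo "no study mounts visible"

# df -h also lists every mount point along with its free space
df -h
```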

Warning

Important: Data on your workspace root volume is deleted when you terminate the workspace. Always back up important data before terminating — or use studies for anything you want to keep.
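One way to act on that warning, sketched with the same placeholder key path and IP used elsewhere on this page: bundle the root volume's working files into a single archive, then pull it down with scp before terminating.

```shell
# Bundle the home directory (minus disposable temp files) into one archive
tar czf /tmp/workspace-backup.tar.gz -C "$HOME" --exclude='temp' .

# Then, from your LOCAL machine, copy the archive down before terminating:
# scp -i ~/.ssh/your-key.pem ec2-user@<workspace-ip>:/tmp/workspace-backup.tar.gz ./
```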

Moving Data In and Out

SCP (Secure Copy)

Transfer files between your local computer and the workspace from the command line.

# Upload a single file
scp -i ~/.ssh/your-key.pem localfile.txt ec2-user@<workspace-ip>:/home/ec2-user/

# Upload a directory
scp -i ~/.ssh/your-key.pem -r local-folder/ ec2-user@<workspace-ip>:/home/ec2-user/

# Download a file
scp -i ~/.ssh/your-key.pem ec2-user@<workspace-ip>:/home/ec2-user/results.txt ./

SFTP (Graphical Transfer)

Use an SFTP client like FileZilla, WinSCP, or Cyberduck.

Setting          Value
Protocol         SFTP
Host             Your workspace IP
Port             22
Username         ec2-user
Authentication   Private key file
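The same protocol the graphical clients use is also available from the terminal. A sketch using OpenSSH's sftp in batch mode, with the usual placeholder host:

```shell
# Write the transfer commands to a batch file...
cat > /tmp/sftp-batch.txt <<'EOF'
put localfile.txt /home/ec2-user/
get /home/ec2-user/results.txt .
bye
EOF

# ...then run them non-interactively against the workspace:
# sftp -i ~/.ssh/your-key.pem -b /tmp/sftp-batch.txt ec2-user@<workspace-ip>
```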

Web IDE Upload

  1. Connect to your workspace via Open IDE
  2. Right-click in the file explorer
  3. Select Upload
  4. Choose files from your computer

AWS CLI (for S3)

# Download from S3
aws s3 cp s3://bucket-name/file.txt /home/ec2-user/

# Upload to S3
aws s3 cp /home/ec2-user/results.txt s3://bucket-name/

# Sync a directory
aws s3 sync /home/ec2-user/data/ s3://bucket-name/data/

Organizing Your Data

A recommended directory structure:

/home/ec2-user/
├── data/           # Input datasets
├── scripts/        # Analysis scripts
├── results/        # Output files
├── logs/           # Log files
└── temp/           # Temporary files (can be deleted)
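The layout above can be created in one pass:

```shell
# Create the recommended directory layout
# ($HOME is /home/ec2-user on the workspace)
mkdir -p "$HOME/data" "$HOME/scripts" "$HOME/results" "$HOME/logs" "$HOME/temp"
```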

Managing Storage Space

# Check available space
df -h

# Check directory sizes
du -sh /home/ec2-user/*

# Find large files
find /home/ec2-user -size +100M -type f
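The commands above combine into a quick cleanup pass: rank directories by size, then empty the temp area that the recommended layout marks as deletable.

```shell
# Rank home-directory contents by size, largest first
du -sh "$HOME"/* 2>/dev/null | sort -rh | head

# Empty the temp area (marked deletable in the recommended layout)
mkdir -p "$HOME/temp"                 # ensure it exists before cleaning
find "$HOME/temp" -mindepth 1 -delete
```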

Backup Recommendations

Data Type          Backup Frequency      Method
Code and scripts   After each session    Git repository
Results            After each analysis   SCP/SFTP to local
Large datasets     As needed             S3 bucket or study
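For the first row of the table, a minimal end-of-session git routine — the name, email, and remote below are placeholders to replace with your own:

```shell
# Snapshot your scripts directory into version control
mkdir -p "$HOME/scripts" && cd "$HOME/scripts"
git init -q                              # first time only
git add -A
git -c user.name="Your Name" -c user.email="you@example.com" \
    commit -q -m "End-of-session checkpoint" \
    || echo "nothing new to commit"

# Push to your remote once one is configured (placeholder):
# git push origin main
```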

Now that your data is in place, learn how to manage your workspace day-to-day: Managing Your Workspace