Rolling your own offsite internet based backup?

Stereodude

Not really a
Joined
Jan 22, 2002
Messages
10,865
Location
Michigan
Last year I had the idea to build a remote rsync target. It never saw the light of day. :(

I never got it fully working, mainly because I wanted whole-disk encryption in Linux, which meant the system needed a password fed to it at boot in order to decrypt the volume. To accomplish this the system needed Early-SSH, which seemed fairly complicated to set up and get working, so I gave up.

I want to revisit the concept, but I now have a few more potential problems to overcome. I'm not set on any OS, software, or even method.

My Requirements:
  • Remote target will not be powered all the time (for power savings).
  • The remote target will be connected to the internet via DSL (PPPoE). This means its IP is constantly changing and the connection is not constant, but "on-demand".
  • Data sent over the internet must be encrypted, but I'm flexible on the implementation. Setting up a VPN from the remote device to my network is okay, using software that uses SSL/SSH is okay, etc.
  • Intelligent differential copying (like rsync)
  • Local end runs Windows XP Pro x64
Preferences:
  • Run as service / not require log-in on remote side
  • Data on the remote system should be encrypted, but I realize this is likely to create an even bigger problem than before, since the system won't run 24/7 and the encryption would need the password every night.
Possible Software & issues:
  • Unison - not secure unless using SSH (meaning cygwin on XP box) or VPN / no inherent file encryption
  • rsync - requires cygwin on XP box which is a bit messy / no inherent file encryption
  • Syncrify - remote system should be the server and data is pushed, so how to shut down remote server when local client is finished?

Right now I'm leaning toward installing XP Pro + Syncrify on the remote system, since Syncrify has integrated encryption (done on the local / client side), provided I can come up with a good way to shut the remote system down when the backup is finished.

And in case anyone was curious how I plan to start the remote system regularly, I plan to use WoL to power up the machine. It will be triggered by ether-wake running as a scheduled task in the Tomato packin' Linksys WRT-54GL on the other end.
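For anyone unfamiliar with what a WoL trigger actually sends, here is a minimal Python sketch of the "magic packet": six 0xFF bytes followed by the target MAC address repeated 16 times. As far as I know ether-wake sends this as a raw Ethernet frame; the sketch below uses a UDP broadcast instead, which is a common alternative and an assumption here, not what the router would run. The function names, the broadcast address, and port 9 are illustrative choices.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN payload for a MAC like 00:11:22:33:44:55."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected 12 hex digits in MAC, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)
    # Synchronisation stream (6 x 0xFF) followed by 16 copies of the MAC.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP on the local network segment."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

The packet has to originate on the same LAN segment as the sleeping machine, which is why the trigger lives on the remote router rather than on the local box.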

Any ideas / tips / suggestions? (other than signing up for Carbonite, or similar)
 

ddrueding

Fixture
Joined
Feb 4, 2002
Messages
19,614
Location
Horsens, Denmark
With the server changing IP address and being unavailable, the server will need to initiate the connection.

I would probably have the server power up on a timer, automatically connect via VPN, and then start pulling the data from a shared folder. I would make that shared folder contain a pre-compressed/encrypted copy of your stuff.

1. Before server scheduled connection time, workstation performs local backup/encryption to a folder. Only this folder is available to the server.
2. On a timer, server powers up and connects via VPN, pulling the contents of the share over.

Advantage: Server, however compromised, cannot get to non-encrypted data.
Disadvantage: I don't think differential copying would be possible, as the server can't tell the difference.
 

Stereodude

Not really a
Joined
Jan 22, 2002
Messages
10,865
Location
Michigan
That plan sounds like a disaster since you would have to upload a new full backup every night which is totally not feasible. Also, I don't know if you've ever tried to compress a pile of files and encrypt them before, but it's not a fast process.
 

ddrueding

Fixture
Joined
Feb 4, 2002
Messages
19,614
Location
Horsens, Denmark
Stereodude said:
That plan sounds like a disaster since you would have to upload a new full backup every night which is totally not feasible. Also, I don't know if you've ever tried to compress a pile of files and encrypt them before, but it's not a fast process.

The only way it would work is to make the backup a differential, then compress/encrypt that for upload. I know the compression/encryption would be slow; that is why you would do it in advance.
 

ddrueding

Fixture
Joined
Feb 4, 2002
Messages
19,614
Location
Horsens, Denmark
On the workstation. Even MS Backup would be able to do it.

Have the backups run on the workstation, then take the daily diffs, encrypt them, and stick them in the share to be copied.

There is no way to have the server do the diffs if you don't want the server to have knowledge of the data.
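A rough sketch of the "daily diffs on the workstation" step, in Python. It assumes a JSON manifest of SHA-256 digests kept on the workstation outside the backed-up folder; the manifest format and the function names are hypothetical, and this is not how MS Backup does it. Only the files it returns would then need to be compressed, encrypted, and dropped into the share.

```python
import hashlib
import json
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks to keep memory use flat."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(source: Path, manifest_path: Path) -> List[Path]:
    """Return files under `source` that are new or modified since the last run.

    The manifest is rewritten on every call, so keep it outside `source`.
    """
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    new = {}
    changed = []
    for path in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = path.relative_to(source).as_posix()
        new[rel] = file_digest(path)
        if old.get(rel) != new[rel]:
            changed.append(path)
    manifest_path.write_text(json.dumps(new, indent=2))
    return changed
```

Note that this only detects new and modified files. Deletions are dropped from the manifest silently, so a real script would have to decide how to propagate them.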
 

Stereodude

Not really a
Joined
Jan 22, 2002
Messages
10,865
Location
Michigan
ddrueding said:
With the server changing IP address and being unavailable, the server will need to initiate the connection.
I'm not sure this is actually true. WoL triggered from the remote router can be used to wake the remote box. The remote box can power up and create the on-demand PPPoE connection. The remote router can be set to automatically register its WAN-side IP with dyndns or similar. Once this has happened, the local side device could initiate contact with the remote box. Obviously there would need to be some sort of delay, but that's not too hard to take care of. The remote box would wake at, say, 2:00AM and the local box would try to connect to the remote box at, say, 2:05AM.
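Instead of relying on a fixed five-minute gap, the local side could simply retry until the dynamic DNS name resolves and the SSH port answers. A minimal Python sketch, assuming the remote box listens on TCP port 22; the retry counts and the function name are made up for illustration. The `connect` and `sleep` parameters exist only so the loop can be exercised without a network.

```python
import socket
import time


def wait_for_remote(host, port=22, attempts=10, delay=30.0, timeout=5.0,
                    connect=socket.create_connection, sleep=time.sleep) -> bool:
    """Retry a TCP connection until the remote box answers or attempts run out.

    DNS failures (name not registered yet) and refused or timed-out
    connections all surface as OSError, so one except clause covers them.
    """
    for attempt in range(attempts):
        try:
            with connect((host, port), timeout=timeout):
                return True
        except OSError:
            if attempt < attempts - 1:
                sleep(delay)
    return False
```

With the defaults this gives the remote side roughly five minutes to boot, dial PPPoE, and update its DNS record before the backup is abandoned for the night.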

After sleeping on it, I'm thinking the remote box should be a Linux box. A secure SSH tunnel can be created automatically without any passwords (using keys), initiated from the local side. SSH is basically a terminal, so my local box could mount a remote TrueCrypt volume (or similar), providing the password remotely over SSH. The mounted volume could then be used by rsync. That would keep the data encrypted, and if the machine were stolen the data would be secure, since the password to decrypt it is not stored on the machine and there's no way for the remote box to get it. In addition, because contact is initiated from my local side, someone who got their hands on the machine and logged into it wouldn't be able to exploit the stored SSH keys to gain access to my local system, since they would have no idea what PC the keys are for or how to connect to it.
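The sequence described above could be sketched roughly as follows in Python. This is only an illustration under several assumptions: it substitutes a LUKS volume opened with cryptsetup for the TrueCrypt volume (the "or similar" case), it assumes key-based SSH as a user allowed to run these commands on the remote box, and it assumes ssh and rsync are available on the local side. All function names, hostnames, and paths are placeholders. The passphrase is sent on stdin over the SSH session so it never appears in a command line or on the remote disk.

```python
import subprocess
from typing import List, Optional, Tuple

Step = Tuple[List[str], Optional[str]]  # (argv, text to feed on stdin)


def plan_backup(remote: str, device: str, name: str, mount_point: str,
                source: str, passphrase: str) -> List[Step]:
    """Build the ordered commands: unlock, mount, sync, unmount, lock, power off."""
    # BatchMode makes ssh fail instead of prompting if key login does not work.
    ssh = ["ssh", "-o", "BatchMode=yes", remote]
    return [
        (ssh + ["cryptsetup", "luksOpen", device, name], passphrase + "\n"),
        (ssh + ["mount", f"/dev/mapper/{name}", mount_point], None),
        (["rsync", "-az", "--delete", "-e", "ssh", source,
          f"{remote}:{mount_point}/"], None),
        (ssh + ["umount", mount_point], None),
        (ssh + ["cryptsetup", "luksClose", name], None),
        (ssh + ["shutdown", "-h", "now"], None),
    ]


def run_backup(steps: List[Step], runner=subprocess.run) -> None:
    """Run each step in order, stopping at the first failure."""
    for argv, stdin_text in steps:
        runner(argv, input=stdin_text, text=True, check=True)
```

A real script would want the unmount, lock, and shutdown steps in a `finally` block so that a failed rsync does not leave the volume unlocked and the box running all night.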
 

Mercutio

Fatwah on Western Digital
Joined
Jan 17, 2002
Messages
21,809
Location
I am omnipresent
I have a setup where one of my machines at home logs in to a remote machine via SSH and copies whatever stuff TrueImage put on its removable drive that day.

Those backups are fairly small, maybe 250MB/day. The copy still takes a half hour. I wouldn't want to try to pull gigabytes of crap between two consumer internet connections.
 

Stereodude

Not really a
Joined
Jan 22, 2002
Messages
10,865
Location
Michigan
Mercutio said:
I have a setup where one of my machines at home logs in to a remote machine via SSH and copies whatever stuff TrueImage put on its removable drive that day.

Those backups are fairly small, maybe 250MB/day. The copy still takes a half hour. I wouldn't want to try to pull gigabytes of crap between two consumer internet connections.
I don't plan to move gigabytes of stuff. The initial backup would be done with the remote system on my LAN. I would expect the nightly backup to be fairly small (<100MB/night). I don't plan to back up everything. Mostly just documents & photos. No audio or video files.

If I need to restore I can physically go get the box and bring it back. I'm strongly hoping to never need it. If I need it, I've had a huge disaster (like my house burning down) since I've got several backups of the data on site already.
 

Sol

Storage is cool
Joined
Feb 10, 2002
Messages
960
Location
Cardiff (Wales)
If you already plan to have a router running Tomato at the remote end, then you could just configure the router to do dynamic DNS and have the client kick off the WoL (probably via SSH and a script/shortcut) when it needs the remote system.

You could get the router to do some or all of the encryption as well via a VPN, which would make the remote server setup much simpler. It would also mean you could consider using a USB drive attached to the router or a home/office NAS box instead of a full PC, which might save on power and/or be faster than booting a PC with WoL for every backup.
 

Stereodude

Not really a
Joined
Jan 22, 2002
Messages
10,865
Location
Michigan
Some of that would be possible in theory. The problem is there are other devices plugged into the router that need unfettered internet access, so there's some limit to what I can do, since I don't want a constant VPN connection between the routers or similar strategies that might otherwise be appealing if the remote box was the only thing sitting there.
 

Howell

Storage? I am Storage!
Joined
Feb 24, 2003
Messages
4,740
Location
Chattanooga, TN
How does a constant VPN tunnel fetter access to the other devices?

In a pinch you could set up and tear down the VPN connection on a schedule and kick off the WoL on a schedule.
 

Howell

Storage? I am Storage!
Joined
Feb 24, 2003
Messages
4,740
Location
Chattanooga, TN
Quote:
I don't want all their traffic being routed through my connection then to theirs because they're effectively on my network via the VPN.

You control that with routing rules. I would be surprised if Tomato does not facilitate split-tunneling.
 

Sol

Storage is cool
Joined
Feb 10, 2002
Messages
960
Location
Cardiff (Wales)
Quote:
I don't want all their traffic being routed through my connection then to theirs because they're effectively on my network via the VPN.

I was suggesting that the client machine on your local network would connect via VPN to the remote router. So the client machine itself would be on the remote network (or, more precisely, it would be on the VPN network and the remote router would route between the VPN network and the remote network), but no other machines on the local network would be (unless they also had VPN access in order to back stuff up). No machines on the remote network would have access to machines on the local network, other than the client itself, and then only if it made services available on the VPN interface.

Conceptually the network would look something like the attached image. The remote router would have routing rules which enable communication between the 10.0.8.x network and the 192.168.1.x network, but as the client machine would not do any routing there is no way to get from the 192.168.1.x network to the 192.168.0.x network or vice versa. The client machine's default route would always be 192.168.0.1, so only traffic destined for the 10.0.8.x or 192.168.1.x networks would ever pass over the VPN. (x.x.x.x and y.y.y.y are the external IP addresses of the routers, in case that wasn't obvious.)

I'm pretty sure this is roughly what you would get by default with OpenVPN on Tomato.
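To make the split-tunnel behaviour concrete, here is a small Python model of the client machine's routing decision using longest-prefix matching. The /24 masks are an assumption read off the "x" in the addresses above, and the labels are just illustrative strings, not real interface names.

```python
import ipaddress

# Client routing table as described: two networks via the VPN, the local
# LAN directly, and everything else via the local router.
ROUTES = [
    (ipaddress.ip_network("10.0.8.0/24"), "vpn"),
    (ipaddress.ip_network("192.168.1.0/24"), "vpn"),
    (ipaddress.ip_network("192.168.0.0/24"), "lan"),
    (ipaddress.ip_network("0.0.0.0/0"), "default via 192.168.0.1"),
]


def route_for(dest: str) -> str:
    """Pick the most specific matching route, as an IP stack would."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, via) for net, via in ROUTES if addr in net]
    net, via = max(matches, key=lambda m: m[0].prefixlen)
    return via


print(route_for("192.168.1.10"))  # remote backup box: goes over the VPN
print(route_for("8.8.8.8"))       # ordinary internet traffic: stays off the VPN
```

Because only the client holds the two VPN routes, none of the other local machines, and none of the other devices behind the remote router, have any path between the 192.168.0.x and 192.168.1.x networks.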
 

blakerwry

Storage? I am Storage!
Joined
Oct 12, 2002
Messages
4,203
Location
Kansas City, USA
Website
justblake.com
I've run into the same problem as you have with full disk encryption (basically key encryption) and haven't found a good solution with regard to automation. You either trust the system to have an unencrypted key or you don't. If you don't, then you have to manually enter a pass-phrase.

If you don't trust the network and you don't trust the remote storage, then I definitely agree to encrypting before you send. Something like pgp/gpg and SyncBack Freeware or even a simple batch copy or ftp should be fine. Push or pull is fine.

If you encrypt beforehand, then rsync has no inherent advantage, and VPN/SSH/HTTPS is redundant. Just make sure that if it's a pull, that the remote system only has access to the encrypted data, and not the full drive.
 