# Setting up a Replica Server (Stratum 1)

While a CernVM-FS Stratum 0 repository server is able to serve clients directly, a large number of clients is better served by a set of Stratum 1 replica servers. Multiple Stratum 1 servers improve reliability, reduce the load, and protect the Stratum 0 master copy of the repository from direct access. Together, the Stratum 0 server, the Stratum 1 servers and the site-local proxy servers can be seen as a content distribution network. The figure below shows the situation for the repositories hosted in the cern.ch domain.

CernVM-FS content distribution network for the cern.ch domain: Stratum 1 replica servers are located in Europe, the U.S. and Asia. One protected read/write instance (Stratum 0) feeds the public, distributed mirror servers. A distributed hierarchy of proxy servers fetches content from the closest public mirror server.

A Stratum 1 server is a standard web server that uses the CernVM-FS server toolkit to create and maintain a mirror of a CernVM-FS repository served by a Stratum 0 server. To this end, the cvmfs_server utility provides the add-replica command. This command registers the Stratum 0 URL and prepares the local web server. Periodic synchronization has to be scheduled, for instance with cron, using the cvmfs_server snapshot -a command. The advantage over general-purpose mirroring tools such as rsync is that all CernVM-FS file integrity verification mechanisms from the Fuse client are reused. Additionally, with the aid of the CernVM-FS file catalogs, the cvmfs_server utility knows beforehand (without remote listing) which files to transfer.

In order to prevent accidental synchronization from a repository, the Stratum 0 repository maintainer has to create a .cvmfs_master_replica file in the HTTP root directory. This file is created by default when a new repository is created. Note that replication can thrash caches that might exist between Stratum 1 and Stratum 0. A direct connection is therefore preferable.

## Squid Configuration

The Squid configuration differs from the site-local Squids because the Stratum 1 Squid servers are transparent to the clients (reverse proxy). As the expiry rules are set by the web server, Squid cache expiry rules remain unchanged.

The following lines should appear accordingly in /etc/squid/squid.conf:

http_port 80 accel
http_port 8000 accel
http_access allow all
cache_peer <APACHE_HOSTNAME> parent <APACHE_PORT> 0 no-query originserver

cache_mem <MEM_CACHE_SIZE> MB
cache_dir ufs /var/spool/squid <DISK_CACHE_SIZE in MB> 16 256
maximum_object_size 1024 MB
maximum_object_size_in_memory 128 KB


Note that http_access allow all has to be inserted before (or instead of) the line http_access deny all. If Apache is running on the same host, APACHE_HOSTNAME will be localhost. In that case there is no performance advantage in having Squid cache files that come from the same machine, so you can configure Squid not to cache them. Do that with the following lines:
acl CVMFSAPI urlpath_regex ^/cvmfs/[^/]*/api/
cache deny !CVMFSAPI


Squid will then cache only API calls, so MEM_CACHE_SIZE and DISK_CACHE_SIZE can be set quite small.
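To see which requests the CVMFSAPI rule leaves cacheable, the pattern can be tried against a few URL paths. The sketch below reuses the regular expression from the acl line in Python; the sample paths and the is_cached helper are hypothetical and only illustrate the matching, they are not part of Squid or CernVM-FS.

```python
import re

# Same pattern as the CVMFSAPI acl. Squid's urlpath_regex is applied to the
# URL path, which this sketch approximates with re.search on a path string.
CVMFSAPI = re.compile(r"^/cvmfs/[^/]*/api/")


def is_cached(url_path: str) -> bool:
    """Return True if 'cache deny !CVMFSAPI' still allows caching this path."""
    return CVMFSAPI.search(url_path) is not None


if __name__ == "__main__":
    # Hypothetical request paths, for illustration only.
    samples = [
        "/cvmfs/example.cern.ch/api/v1.0/geo/x/a,b",
        "/cvmfs/example.cern.ch/.cvmfspublished",
        "/cvmfs/example.cern.ch/data/ab/cdef",
    ]
    for path in samples:
        print(f"{path}: {'cached' if is_cached(path) else 'not cached'}")
```

Only the first sample path falls under the api/ prefix of a repository, so it is the only one Squid would keep in its cache under this rule.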

Check the configuration syntax with squid -k parse. Create the hard disk cache area with squid -z. To make an increased file descriptor limit effective for Squid, run ulimit -n 8192 before starting the squid service.

## Geo API Setup

One of the essential services supplied by Stratum 1s to CernVM-FS clients is the Geo API. It lets clients share the same configuration worldwide while automatically sorting Stratum 1s geographically, so that each client connects to the closest ones first. The Geo API makes use of a GeoIP database from MaxMind that translates the IP addresses of clients to longitude and latitude.
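The idea of ordering servers by geographic proximity can be illustrated with a great-circle distance calculation. This is a minimal sketch of the concept only, not the code used by the Geo API; the hostnames are hypothetical and the coordinates are rough values chosen for the example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def sort_by_distance(client, servers):
    """Return server names ordered from nearest to farthest from the client."""
    return sorted(servers, key=lambda name: great_circle_km(*client, *servers[name]))


if __name__ == "__main__":
    # Hypothetical Stratum 1 hosts with approximate (latitude, longitude).
    servers = {
        "s1-asia.example.org": (25.0, 121.5),
        "s1-us.example.org": (41.8, -88.3),
        "s1-europe.example.org": (46.2, 6.1),
    }
    client = (48.9, 2.4)  # a client located roughly in Paris
    print(sort_by_distance(client, servers))
```

In the real service, the client position comes from looking up its IP address in the GeoIP database rather than from hard-coded coordinates.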

The database is free, but the MaxMind End User License Agreement requires that each user of the database sign up for an account and promise to update the database to the latest version within 30 days of a new version being issued. At the end of the signup process you receive a License Key. The cvmfs_server add-replica and snapshot commands will take care of automatically updating the database if you put a line like the following in /etc/cvmfs/server.local, replacing <license key> with the key you get from the signup process:

CVMFS_GEO_LICENSE_KEY=<license key>


To keep the key secret, set the mode of /etc/cvmfs/server.local to 600.

Alternatively, if you have a separate mechanism for installing and updating the GeoLite2 City database file, you can instead set CVMFS_GEO_DB_FILE to the full path where you have installed it. If the path is NONE, no database is required, but note that this breaks the client Geo API, so only use it for testing, when the server is not used by production clients. If the database is installed in the default path used by MaxMind’s own geoipupdate tool, /usr/share/GeoIP, then cvmfs_server will use it from there and neither variable needs to be set.

Normally, repositories on Stratum 1s are created with root as the owner, and the cvmfs_server snapshot command is run by root. If you want to use a different user id while still using the builtin mechanism for updating the geo database, change the owner of /var/lib/cvmfs-server/geo and /etc/cvmfs/server.local to that user id.

The builtin geo database update mechanism normally checks for updates once a week on Tuesdays but can be controlled through a set of variables defined in cvmfs_server beginning with CVMFS_UPDATEGEO_. Look in the cvmfs_server script for the details. An update can also be forced at any time by running cvmfs_server update-geodb.

## Monitoring

The cvmfs_server utility reports status and problems to stdout and stderr.

For the web server infrastructure, we recommend standard Nagios HTTP checks. They should be configured with the URL http://$replica-server/cvmfs/$repository_name/.cvmfspublished. This file can also be used to monitor if the same repository revision is served by the Stratum 0 server and all the Stratum 1 servers. In order to tune the hardware and cache sizes, keep an eye on the Squid server’s CPU and I/O load.
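A revision comparison along these lines could be scripted as follows. The sketch assumes that .cvmfspublished consists of text lines whose first character identifies the field, that the S field holds the revision number, and that a line containing only -- separates the fields from the signature. The function names and the example hostnames are hypothetical.

```python
from urllib.request import urlopen


def parse_revision(manifest: bytes) -> int:
    """Extract the revision from manifest bytes (assumed to be the 'S' field)."""
    for line in manifest.split(b"\n"):
        if line == b"--":
            # Everything after the separator is the signature block; stop here.
            break
        if line.startswith(b"S"):
            return int(line[1:].decode("ascii"))
    raise ValueError("no revision (S) field found in manifest")


def fetch_revision(base_url: str, repo: str, timeout: float = 10.0) -> int:
    """Download .cvmfspublished for a repository and return its revision."""
    url = f"{base_url}/cvmfs/{repo}/.cvmfspublished"
    with urlopen(url, timeout=timeout) as resp:
        return parse_revision(resp.read())


def lagging_replicas(stratum0_rev: int, replica_revs: dict) -> list:
    """Return the replicas whose revision is older than the Stratum 0 revision."""
    return sorted(host for host, rev in replica_revs.items() if rev < stratum0_rev)


if __name__ == "__main__":
    # Hypothetical revisions, as they might be returned by fetch_revision().
    revs = {"s1-a.example.org": 42, "s1-b.example.org": 41}
    print(lagging_replicas(42, revs))
```

A replica that is behind by a revision shortly after a publish is expected, since synchronization is periodic; an alert is more useful when a replica stays behind for longer than the snapshot interval.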

Keep an eye on HTTP 404 errors. For normal CernVM-FS traffic, such failures should not occur. Traffic from CernVM-FS clients is marked by an X-CVMFS2 header.
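One way to watch for such failures is to count 404 responses under /cvmfs/ in the web server access log. The sketch below assumes a log in the Apache common or combined format; the regular expression, the count_404s helper and the sample lines are illustrative assumptions and would need adjusting for other log formats, such as Squid's native one.

```python
import re
from collections import Counter

# Matches the quoted request and the status code of a common/combined log line,
# e.g. "GET /cvmfs/repo/data/ab/cdef HTTP/1.1" 404 209
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def count_404s(lines):
    """Count HTTP 404 responses per request path below /cvmfs/."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that do not look like access log entries
        if match.group("status") == "404" and match.group("path").startswith("/cvmfs/"):
            hits[match.group("path")] += 1
    return hits


if __name__ == "__main__":
    # Hypothetical log lines for illustration.
    sample = [
        '192.0.2.10 - - [01/Jan/2024:00:00:00 +0000] "GET /cvmfs/example.cern.ch/data/ab/cdef HTTP/1.1" 404 209 "-" "-"',
        '192.0.2.11 - - [01/Jan/2024:00:00:01 +0000] "GET /cvmfs/example.cern.ch/.cvmfspublished HTTP/1.1" 200 601 "-" "-"',
    ]
    for path, count in count_404s(sample).items():
        print(count, path)
```

To restrict the count to CernVM-FS clients, the log format would first have to be extended to record the X-CVMFS2 request header, which standard log formats do not include.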