5 Configuration Directive Reference
5.1 Generic Directives
GlobuleAdminURL url [ password ]
Set the internal reference and configuration location. Globule requires a
path by which the internals of Globule can be reached. This is used by
Globule to contact itself and to provide monitoring information. Any location
which addresses an available space of your web-server is valid. The URL must
end with a slash and must be a fully qualified path, including the protocol.
GlobuleAdmURL is an alias for GlobuleAdminURL.
For use with the Globule Broker System (GBS) at http://www.globeworld.net/. The GlobuleBrokerConfigurationSerial directive is generated automatically by the GBS system to record when this configuration file was last generated. If present, it should not be changed or removed from the configuration, and there is no reason to specify it by hand. Globule itself only stores the specified date value and does not interpret it. The value is returned by the page specified by GlobuleAdminURL with the path /gbs appended.
GlobuleFancyServerName "A fancy name for this machine"
For this reason, three types of values are recommended for this directive:
If not specified, defaults to the ServerName in effect. This directive can be specified at a global level and later overridden per imported or exported Globule section.
GlobuleMonitor item filter options
Deprecated and ignored; use GlobuleDebugProfile instead.
GlobuleDebugProfile [ defaults | normal | extended | verbose ]
Specifies how verbose Globule's error reporting should be. The ``defaults'' setting is now equivalent to ``normal''. This default setting does not output much information about Globule's workings and logs only serious error messages. You can change this by selecting one of the other standard profiles.
Note that even when the profile is set to ``verbose'', messages with log-level ``informational'' and the like may still be suppressed by Apache's LogLevel directive. For instance, if you set your LogLevel to error, you will not see most messages which would have been output by the extended profile. Also remember that with Apache, you should specify the LogLevel before the ErrorLog directive.
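As a minimal sketch of the interaction described above, the fragment below combines a Globule profile with Apache's log settings. The log file path is a hypothetical placeholder, and ``info'' is assumed to be the Apache level corresponding to the ``informational'' messages mentioned above.

```apache
# LogLevel is given before ErrorLog, as recommended above.
# "info" lets informational messages through; "error" would hide most of them.
LogLevel info
# Hypothetical log file location.
ErrorLog logs/error_log

# Ask Globule for more detail than the "normal" profile provides.
GlobuleDebugProfile extended
```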
Instructs Globule to use a shared memory segment of the given size instead of the default. The size is in bytes, but may be followed by a unit as specified in Section 5.7, such as in:
GlobuleMemSize "8 mb"
This specifies 8 megabytes, or 8388608 bytes, which is also the default.
Many Linux/Unix operating systems do not allow large quantities of shared memory to be allocated. Instructions on how to check your current limits, and increase them if necessary, are available in Section 6.2.1.
5.2 Replication Directives
<Location /path>
  GlobuleReplicate [ on | off ]
  [ GlobuleReplicaIs url secret ]
  [ GlobuleBackupIs url secret ]
  [ GlobuleRedirectorIs url secret ]
</Location>
The GlobuleReplicate directive is used in an origin server to define which part of the site must be copied to replica, backup or redirector servers. It must be contained within a standard Apache <Location>, <Directory> or <Files> environment. It specifies that all the contents which match that environment should (or should not) be replicated to replica servers, overriding an earlier-defined parent location specification. This way you can turn off replication for a sub-location, and turn it on again for a sub-sub-location. Also replication can be turned off for files matching a given pattern, using the Apache <Files> and <FilesMatch> environment.
<Location />
  GlobuleReplicate on
  GlobuleReplicaIs ...
  GlobuleBackupIs ...
  GlobuleRedirectorIs ...
  <Location /cgi-bin/>
    GlobuleReplicate off
  </Location>
  <Files *.pdf>
    GlobuleReplicate off
  </Files>
</Location>
In a hierarchy of <Location> with GlobuleReplicate set on and off, only the parent definition should contain GlobuleReplicaIs, GlobuleBackupIs and/or GlobuleRedirectorIs directives. They are defined for the whole section of locations. You cannot overload them in sub-locations.
Note that a location definition using <Location /export> may mean something different from <Location /export/>. The latter is a directory specification, which is probably what you want.
GlobuleReplicaIs http://replica.full.domain/ p4ssw0rd1
GlobuleReplicaIs http://replica.full.domain:8333/replicapath/ p4ssw0rd2
GlobuleReplicaIs is used at origin servers to specify which servers should act as a replica for the site. For each specified replica server, it also defines a password which is used for mutual authentication between the origin and the replica. Note that it is in general not a good idea to use the same password for several origin-replica pairs.
Replica servers listed on the origin server must add a corresponding GlobuleReplicaFor directive. The origin's GlobuleReplicaIs directive and the corresponding replica's GlobuleReplicaFor directive must mention the same password, otherwise mutual authentication will fail.
The URL mentioned in GlobuleReplicaIs is a concatenation of the replica server's fully qualified hostname, port number (if different from 80) and its import path as defined in the replica's GlobuleReplicaFor definition. Fully qualified hostnames are mandatory. The path must end in a slash, as a whole directory is normally exported.
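The pairing rules above can be sketched as follows. The host names, paths, port and password are hypothetical placeholders; the two sections belong in the configuration files of two different servers and are shown together only for comparison.

```apache
# --- On the origin server (hypothetical host origin.example.org) ---
<Location /export/>
  GlobuleReplicate on
  # Replica host, non-default port, and the replica's import path,
  # ending in a slash; followed by the shared password.
  GlobuleReplicaIs http://replica.example.org:8333/import/ p4ssw0rd1
</Location>

# --- On the replica server (hypothetical host replica.example.org:8333) ---
<Location /import/>
  # URL of the exported path at the origin, with the same password.
  GlobuleReplicaFor http://origin.example.org/export/ p4ssw0rd1
</Location>
```

If the two passwords differ, or either URL does not correspond to the other side's location, mutual authentication should be expected to fail.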
GlobuleBackupIs http://replica.full.domain/ p4ssw0rd1
GlobuleBackupIs http://replica.full.domain:8333/replicapath/ p4ssw0rd2
GlobuleBackupIs is similar to the GlobuleReplicaIs directive, except that it is used at an origin server to define backups of the site (rather than to define its replicas). It also takes the same arguments.
Backup servers listed on the origin server must add a corresponding GlobuleBackupFor directive. The origin's GlobuleBackupIs directive and the corresponding backup's GlobuleBackupFor directive must mention the same password, otherwise mutual authentication will fail.
In addition, all replicas of the site must add a GlobuleBackupForIs directive in their configuration. When they need a fresh copy of a document, if the origin server is unreachable, then they will retrieve it from one of the backups specified that way.
GlobuleRedirectorIs http://redir.full.domain/ p4ssw0rd1
GlobuleRedirectorIs http://redir.full.domain:8333/redirpath/ p4ssw0rd2
GlobuleRedirectorIs is used at an origin server to specify one or more stand-alone redirectors for the site. Using one or more redirector(s) external to the origin server is useful to keep the site running even though the origin may be down.
If no redirector is specified for a given origin, then the origin server will automatically act as its own redirector.
GlobuleRedirectorIs takes the same arguments as a GlobuleReplicaIs directive.
Redirector servers listed at an origin server must add a corresponding GlobuleRedirectorFor directive. The origin's GlobuleRedirectorIs directive and the corresponding redirector's GlobuleRedirectorFor directive must mention the same password, otherwise mutual authentication will fail.
<Location /path>
  GlobuleReplicaFor origin-url secret
  [ GlobuleBackupForIs origin-url backup-url ]
  [ GlobuleBackupForIs origin-url 2nd-backup-url ]
</Location>
The GlobuleReplicaFor directive is used to configure a replica server. This directive must be used within a <Location> or <VirtualHost> container which indicates the location of the replica. GlobuleReplicaFor must match a corresponding GlobuleReplicaIs directive configured at the origin server.
The GlobuleReplicaFor directive takes as parameters the URL of the origin site and a password. The URL must contain a fully qualified host name, and refer to the whole path specified at the origin server (e.g., if the origin server exports http://www.mysite.com/myorigin/ then you cannot import only http://www.mysite.com/myorigin/subdir/ at the replica). The specified password must be the same as in the origin server's configuration, otherwise authentication will not work.
Note that, if the site has one or more backup(s), then they must be mentioned in each replica server's configuration using the GlobuleBackupForIs directive.
<Location /path>
  GlobuleBackupFor url secret
</Location>
The GlobuleBackupFor directive is used to configure a backup server. This directive must be used within a <Location> or <VirtualHost> container which indicates the location of the backup. GlobuleBackupFor must match a corresponding GlobuleBackupIs directive configured at the origin server.
The URL must contain a fully qualified host name, and refer to the whole path specified at the origin server (e.g., if the origin server exports http://www.mysite.com/myorigin/ then you cannot import only http://www.mysite.com/myorigin/subdir/ at the backup). The specified password must be the same as in the origin server's configuration, otherwise authentication will not work.
<Location /path/>
  GlobuleReplicaFor origin-url secret
  GlobuleBackupForIs origin-url backup-url
</Location>
The GlobuleBackupForIs directive is used at the replica servers to specify the list of backup servers they can access in case the origin is down. The first argument defines the fully qualified URL of the origin server. This should be the same as indicated in the GlobuleReplicaFor directive. The second argument defines the fully qualified URL of the backup.
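The three backup-related directives have to agree across three machines. The following sketch shows one way they might fit together; all host names and passwords are hypothetical, and each section goes into the configuration of the server named in its comment.

```apache
# --- On the origin server (hypothetical host origin.example.org) ---
<Location />
  GlobuleReplicate on
  GlobuleReplicaIs http://replica.example.org/ replica-secret
  GlobuleBackupIs  http://backup.example.org/  backup-secret
</Location>

# --- On the backup server (hypothetical host backup.example.org) ---
<Location />
  # Same password as in the origin's GlobuleBackupIs line.
  GlobuleBackupFor http://origin.example.org/ backup-secret
</Location>

# --- On the replica server (hypothetical host replica.example.org) ---
<Location />
  GlobuleReplicaFor http://origin.example.org/ replica-secret
  # First argument: origin URL (same as in GlobuleReplicaFor).
  # Second argument: URL of the backup to fall back on.
  GlobuleBackupForIs http://origin.example.org/ http://backup.example.org/
</Location>
```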
<Location /path/>
  GlobuleRedirectorFor url secret
</Location>
The GlobuleRedirectorFor directive is used at a redirector server to declare that it acts as an external redirector for an origin site. Each GlobuleRedirectorFor directive must match a corresponding GlobuleRedirectorIs directive at the origin server.
If no external redirector is defined for a given site, then the origin server will perform the redirection itself.
External redirectors are useful because they allow clients to be redirected to replicas even though the master server is down. To build a reasonably fault-tolerant site, at least two external redirectors are necessary.
This directive takes two parameters. The url parameter identifies the site for which the server acts as a redirector. The URL must exactly match an exported path as indicated by a GlobuleReplicate directive at the origin server. The second parameter is a password used for mutual authentication between the origin and the redirector servers. The same password must be present in both configurations.
The GlobuleDefaultReplicationPolicy directive defines the replication policy that should be associated with new documents. After a while, Globule will use the recorded access logs for a document to decide which policy is best for it.
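A matching origin and redirector configuration might look like the sketch below. Host names and the password are hypothetical placeholders, and the two sections belong to two different servers.

```apache
# --- On the origin server (hypothetical host origin.example.org) ---
<Location />
  GlobuleReplicate on
  GlobuleRedirectorIs http://redir.example.org/ redir-secret
</Location>

# --- On the stand-alone redirector (hypothetical host redir.example.org) ---
<Location />
  # URL of the exported path at the origin, with the same password.
  GlobuleRedirectorFor http://origin.example.org/ redir-secret
</Location>
```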
Globule contains five different policies:
The only sensible replication strategy to be used for disconnected origin server operation is currently Invalidate.
The default is Invalidate.
GlobuleMaxDiskSpace is used at replica servers to define how much disk space may be used for storing the replicated site. If the size of the site is greater than the configured allowed space, then rarely-accessed documents will be removed from the replica. Note that Globule may exceed this limit temporarily while servicing a request.
GlobuleMaxDiskSpace takes one parameter, which is the size allocated for this replica in bytes. You may append a unit such as ``kb'', ``mb'', ``gb'' or even ``gigabytes''. Note that the default unit is bytes, so specifying just ``100'' will probably cause problems, as 100 bytes is probably too little for even a single document.
Default value is 50 MB.
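As an illustration, the fragment below raises the limit for one replica. Placing the directive inside the imported section is an assumption, as are the host name, paths and password; the quoted size follows the form used for GlobuleMemSize.

```apache
# On a replica server (hypothetical names throughout).
<Location /import/>
  GlobuleReplicaFor http://origin.example.org/export/ p4ssw0rd1
  # 200 megabytes. A bare "200" would be read as 200 bytes.
  GlobuleMaxDiskSpace "200 mb"
</Location>
```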
Globule never stores documents in main memory. However, for performance reasons, it often stores information about documents (i.e., meta-documents) in memory.
GlobuleMaxMetaDocsInMemory is used at the origin, replica and backup servers to define how many meta-documents may reside in main memory. Note that Globule may temporarily exceed this limit while servicing a request.
GlobuleMaxMetaDocsInMemory takes one parameter, which is (as you guessed) the maximum number of meta-documents.
If you increase this limit, you will probably also need to increase GlobuleMemSize, otherwise your server may crash.
The default value is 500.
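A sketch of raising both limits together is shown below. The values are purely illustrative: the text above gives no formula relating the two settings, so doubling both from their defaults is only an assumption, as is specifying GlobuleMaxMetaDocsInMemory at the global level.

```apache
# Double the shared memory segment from the 8 mb default...
GlobuleMemSize "16 mb"
# ...when doubling the number of in-memory meta-documents from the default 500.
GlobuleMaxMetaDocsInMemory 1000
```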
Globule uses locks to synchronize access to shared memory. The number of locks can be set at a global level or can be overridden in an exported or imported section. A high number of locks will take more resources on your machine, but it will allow better multitasking at the Globule server.
Some Linux/Unix systems will restrict the number of locks that you can allocate. Instructions on how to check your current limits, and increase them if necessary, are available in Section 6.2.2.
The default value is 4 locks per exported or imported web-site.
<Location />
  GlobuleReplicate on
  GlobuleDirectory /var/www/htdocs/somedir/.htglobule
  GlobuleReplicaIs ...
  GlobuleBackupIs ...
  GlobuleRedirectorIs ...
</Location>
Globule needs to store information about documents on disk. Each origin, replica or backup configured at a given Globule server will need its own repository for meta-documents.
If no GlobuleDirectory directive is defined, then meta-documents are stored within the directory which contains the documents, in a sub-directory called .htglobule. The GlobuleDirectory directive allows you to set this directory to a different path.
GlobuleDirectory takes as parameter the absolute path of the directory where meta-documents must be stored instead. If the specified directory does not exist, Globule will create it automatically. When Globule is started as root, it will transfer the ownership of the directories in the GlobuleDirectory to the user that will run the worker servers (i.e., the user specified via the standard Apache User directive). The use of this directive is, however, discouraged and it may be deprecated in the future.
GlobuleDatabase mysql://user#password@hostname/database secret
GlobuleDatabase http://host/path secret
At the origin server, the first syntax form is used to establish a connection to the actual database. The second form is used at the replica sites to tunnel queries to the origin server. In effect, both forms provide a method for executing queries on the database using an HTTP interface, to be used by Globule only (``mounting the database on an HTTP address'').
The GlobuleDatabase directive should be enclosed inside a
The shared password secret must match between the origin and all
replica servers, to shield the usage of the database interface by
5.3 Redirection Directives
For an introduction to redirection in Globule, please read Section 3.3.
The GlobuleRedirectionMode directive defines which redirection mechanism should be used by a server.
GlobuleRedirectionMode must be defined before any exported or imported sections in your configuration. If any of the exported sections want to use DNS redirection, then you must enable DNS redirection at a global level. The individual sections can then override to use HTTP redirection only.
To use DNS redirection, you must use our patched version of Apache, so that it can handle UDP requests. Instructions on how to do that are available in Section 2.
The default value is HTTP.
Globule can use several redirection policies, that is, several ways to decide to which replica clients should be redirected. The GlobuleDefaultRedirectPolicy directive allows you to select one policy out of these:
The default value is static.
To use the AS-based redirection policy, you must first download a view of the current routing information from the routeviews.org Web site. At ftp://ftp.routeviews.org/oix-route-views/ there is a directory for the current year and month, containing a file called oix-full-snapshot-latest.dat.bz2. You need to download the most recent file, and uncompress it using the bzip2 utility. You then make the file available to Globule via this GlobuleBGPDataFile directive.
Based on this file, Globule creates a map of the Internet, and then uses it to calculate the distance between clients and replicas. In general, only the most recent BGP data file needs to be downloaded from the RouteViews site. If the BGP data file is unavailable or corrupted, Globule will default to static redirection.
Default is /dev/null.
When using the AS-based redirection policy, you will want to keep your routing information up-to-date. The GlobuleBGPReloadAfter directive allows you to specify how often (in seconds) Globule should re-read its BGP data file.
By appending a unit such as ``min'', ``seconds'', ``days'' or ``week'', you can
specify the value in a more human-friendly form.
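The two BGP-related directives might be combined as in the sketch below. The file location is a hypothetical placeholder for wherever you stored the uncompressed RouteViews snapshot, and the reload period is only an example.

```apache
# Hypothetical path to the uncompressed oix-full-snapshot-latest.dat file.
GlobuleBGPDataFile /var/lib/globule/oix-full-snapshot-latest.dat
# Re-read the file once a day; the value is given in seconds (24 * 3600).
GlobuleBGPReloadAfter 86400
```

Globule only re-reads the file, so refreshing the snapshot itself (downloading and uncompressing a newer copy) would still have to be arranged separately, for instance from a periodic job.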
5.4 DNS Redirection
A DNS query for a replicated site is answered with a set of IP addresses of the replica servers. Accompanying this data is a Time-To-Live (TTL) field indicating how long the result may be cached by the browser (or proxy, or an intermediate ISP DNS server) before it has to repeat the query. The system contains default TTL values for the different redirection policies. These defaults may be changed using one of the GlobuleTTL* directives.
Lower values will make your site more responsive to changes in the set of available replicas, but they will increase the load at your redirectors.
Attention: TTL values below 600 seconds may strongly reduce your redirection efficiency and are considered bad practice on the Internet.
This sets the TTL value for all DNS responses regarding static sites (see GlobuleDefaultRedirectPolicy).
The default value is 86400 seconds (i.e., 1 day).
This sets the TTL value for all DNS responses regarding round-robin sites (see GlobuleDefaultRedirectPolicy).
The default value is 1800 seconds (i.e., 30 minutes).
This sets the TTL value for all DNS responses regarding AS-based sites (see GlobuleDefaultRedirectPolicy).
The default value is 600 seconds (i.e., 10 minutes).
Upon DNS queries, Globule selects a number of IP addresses based on the availability and redirection policy. Depending on availability and policy used the number of returned addresses is between 0 and count. The default is to try to return 3 IP addresses.
GlobuleMaxIPCount defines how many responses should be returned.
Note that HTTP redirectors can return only one response, so GlobuleMaxIPCount has no effect on them.
The default value is 3.
GlobuleDNSRedirectionAddress :port-number
GlobuleDNSRedirectionAddress ip-address
GlobuleDNSRedirectionAddress ip-address:port-number
GlobuleDNSRedirectionAddress ip-address port-number
The GlobuleDNSRedirectionAddress directive allows you to specify which port the DNS redirector should listen on and/or which specific IP address to bind to. It serves the same purpose as the standard Apache Listen directive, but for DNS requests.
This directive is mostly useful for debug purposes as it allows you to test DNS redirection as non-root user.
Note that, if you select any port number other than 53, then your redirector will not be accessible to regular clients.
This directive must be specified before the first GlobuleRedirectionMode directive.
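For the debugging use mentioned above, a configuration might bind the DNS redirector to an unprivileged port on the loopback interface. The address and port below are illustrative choices, not defaults.

```apache
# Listen for DNS requests on the loopback address, port 5353, so that the
# server can be started as a non-root user. Regular clients, which query
# port 53, will not reach this redirector.
GlobuleDNSRedirectionAddress 127.0.0.1:5353
# Any redirection mode directive must appear after this line.
```

A DNS client that lets you choose the server and port, for example `dig @127.0.0.1 -p 5353` followed by the site name, can then be used to inspect the answers.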
5.5 Periodic tasks
Globule needs to update its internal state on a periodic basis, among other things to check for updated documents, to check whether replica servers are available, and to determine whether resources could be used more efficiently.
Periodic tasks are triggered by a so-called heart-beat mechanism. The following directives allow you to control how fast Globule's heart should beat.
GlobuleHeartBeatInterval 120seconds
GlobuleHeartBeatInterval 2minutes
GlobuleHeartBeatInterval 3hours
This directive controls how often periodic tasks should take place. The most important of these tasks consists of checking whether replica servers are still alive. This should be done on a fairly frequent basis, otherwise it may take a long time before an unavailable replica is noticed and clients are no longer redirected to it.
The argument is in seconds, or can be suffixed with an appropriate unit (seconds, minutes, hours, days, etc.).
The default value is 120 seconds.
GlobulePolicyAdaptationInterval 1200seconds
GlobulePolicyAdaptationInterval 20minutes
GlobulePolicyAdaptationInterval 3hours
Every few heart beat events, Globule will reevaluate its choices of replication policies for each document. If the current policy for a document is no longer the best one, its replication policy will be switched to the new best one. Evaluating the policy is done on a per-document basis, but all documents for an exported path are evaluated in one go. The GlobulePolicyAdaptationInterval directive specifies the delay (in seconds) between policy re-evaluations.
Please note that the GlobulePolicyAdaptationInterval must be a multiple of GlobuleHeartBeatInterval. If it is not, then it will be rounded up to the next multiple.
Any declaration of the GlobuleHeartBeatInterval must precede the declaration of a GlobulePolicyAdaptationInterval.
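The ordering and rounding rules can be illustrated with the following sketch; the interval values are arbitrary examples.

```apache
# The heart-beat interval is declared first.
GlobuleHeartBeatInterval 120seconds
# 500 is not a multiple of 120. According to the rule above it is rounded
# up to the next multiple, 5 * 120 = 600 seconds, so policies would be
# re-evaluated on every fifth heart beat.
GlobulePolicyAdaptationInterval 500seconds
```

Writing the adaptation interval as an exact multiple of the heart-beat interval avoids relying on the rounding and makes the effective value visible in the configuration.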
Replication policies should not be re-evaluated too often, because otherwise the choices will be based on very little information. This will lead to sub-optimal performance. On the other hand, if the interval is set too high, then your system will take a long time to react to a change in access patterns.
The argument is in seconds, or can be suffixed with an appropriate unit. Specifying a GlobulePolicyAdaptationInterval value of 0 means that policies should never be adapted.
The default is 1200 seconds (20 minutes).
5.6 Obscure and rare settings
GlobuleAnythingFor url secret
With this directive you can declare that your server can be a replica for any other server, without entering into a cooperative agreement. The shared password named secret is in fact not used in this case, nor is the specified url. These are currently place-holders for a future extension of this directive to use a common third-party broker.
Currently this directive should just be used as such:
ServerName world.cs.vu.nl
<Location "/">
  GlobuleAnythingFor "http://localhost" "geheim"
</Location>
While an origin server should specify to use the replica-for-everyone server as:
ServerName wereld.cs.vu.nl
<Location "/">
  GlobuleReplicate on
  GlobuleReplicaIs "http://world.cs.vu.nl/http://wereld.cs.vu.nl/" "geheim"
</Location>
This directive is due to change to use a broker site without notification, and its use is highly experimental at this stage.
GlobuleMirrorIs url weight
With this directive you can specify that a non-Globule-enabled server is to be used as a replica server. Such a server uses plain old-fashioned mirroring to fetch the content from the origin server, and its configuration must be done manually. The url is the address of the mirror site, while the weight has the same function as in a regular GlobuleReplicaIs definition, but is a required argument in the GlobuleMirrorIs directive.
Note that most of the advantages of Globule, such as consistency, merged access logs and accounting, are lost. Therefore the use of this directive is highly discouraged. Globule retains the ability to use DNS redirection and makes a minimal check on the availability of the mirror-based server, so as not to redirect to a server which is obviously unavailable. It cannot, however, do a full check on the availability.
We encourage the use of normal Globule-enabled replicas and do not actively support the use of mirrors as can be specified with this directive.
GlobuleDisabledReplicaIs url secret
This directive does almost the same as the GlobuleReplicaIs directive, but the replica server being declared will never actively be used to redirect to. I.e. it is permanently held in the state of unavailable. This can be used to test replica servers, as they can authorize themselves to the origin server.
Using the GlobuleDisabledReplicaIs directive is similar to, but not the same as, defining a replica server with weight 0. We encourage using the weight as a better way to take replicas out of the loop of available replica servers to redirect to.
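The two alternatives can be put side by side as in the sketch below. Host names and passwords are hypothetical, and the optional third argument to GlobuleReplicaIs is the weight.

```apache
<Location "/">
  GlobuleReplicate on
  # Preferred: a regular replica declared with weight 0.
  GlobuleReplicaIs http://replica1.example.org/ secret1 0
  # Alternative: a replica permanently held in the unavailable state.
  GlobuleDisabledReplicaIs http://replica2.example.org/ secret2
</Location>
```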
This directive can be used in conjunction with an origin site specification to indicate that the server is an origin server, but the original documents do in fact not really reside on this server. When the server is requested to serve a document for which it (as an origin server!) has no longer a valid copy, it will try to download the document from a third-party. This third-party upstream server is not under a Globule controlled web-server. This approach is somewhat similar to proxying, but then only for a single web-site.
The GlobuleProxyFor directive is used in the following way to make a copy of
the web-site at
<VirtualHost *>
  ServerName wereld.cs.vu.nl
  <Location "/">
    GlobuleReplicate on
    GlobuleProxyFor http://www.revolutionware.net/
    GlobuleReplicaIs http://world.cs.vu.nl/
  </Location>
</VirtualHost>
Many advantages of Globule are lost in this way; therefore the use of this directive is discouraged except for demonstration purposes. Problems will arise with the translation of links, and the replication of dynamic content is not really possible.
When redirecting browsing users to one of the available replica servers, some redirection policies have the ability to redirect more often to certain replica servers than to others. Each server is assigned a weight; servers with a heavier weight are loaded with more requests than others.
The GlobuleOriginWeight directive assigns a weight to the origin server. Optional parameters to GlobuleReplicaIs assign the weight of each of the replica servers. By default the weight of each server is set to 1.
The value weight must be an integer ranging from 0 to 32767. The default weight when not specified is 1. The weight parameters determine the spread of the load over the available web-servers. If the weight of this server is set at n and the sum of all the weights of all available servers is s, then the optimal load for this server is taken to be n/s of the number of requests. The actual load may vary.
ServerName origin.revolutionware.net
<Location "/">
  GlobuleReplicate on
  GlobuleOriginWeight 1
  GlobuleReplicaIs http://replica.revolutionware.net/ secret 9
</Location>
In the above example it is declared that of each 10 requests, the origin should handle just 1, while the replica server should handle the remaining 9.
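The same n/s rule extends to more than two servers. The sketch below uses hypothetical host names and passwords and illustrative weights; the shares in the comments are the optimal loads from the n/s rule, assuming all three servers are available.

```apache
ServerName origin.example.org
<Location "/">
  GlobuleReplicate on
  # Sum of all weights: s = 2 + 3 + 5 = 10.
  # Origin: n = 2, optimal share 2/10 of the requests.
  GlobuleOriginWeight 2
  # First replica: n = 3, optimal share 3/10.
  GlobuleReplicaIs http://replica1.example.org/ secret1 3
  # Second replica: n = 5, optimal share 5/10.
  GlobuleReplicaIs http://replica2.example.org/ secret2 5
</Location>
```

Because s is the sum over available servers only, the shares shift when a server drops out: if the second replica were unavailable, s would become 5 and the origin's optimal share would rise to 2/5.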
GlobuleStaticResolv dns-name A ip-address
GlobuleStaticResolv dns-name CNAME dns-name
Globule automatically responds to DNS queries for sections for which it is an origin server or redirector and where DNS redirection is enabled. Any query for the name specified in the ServerAlias of such an exported section is answered with an A record of one or more available replica servers. It is possible to add more entries to Globule, so that it can also respond to certain other queries with a fixed result. The GlobuleStaticResolv directive adds these entries.
There are two possible entries which can be added, either A or CNAME DNS
record types. An A record points to the indicated IPv4 address. A CNAME
indicates an alias pointing to another Fully Qualified HostName (FQHN).
Unlike in zone-file syntaxes such as that of BIND, the FQHN specified here should
not have a dot appended to it and is never relative to another domain.
One possible use of this is to use Globule also to resolve the origin- and replica-specific site names. Normally one would have the specific site name in the ServerName, and the generic site name, which is resolved by Globule, as the first entry in the ServerAlias. These specific site names should be static addresses of the origin and replica sites. As such they cannot be resolved by Globule itself and should be resolved by an external nameserver. This is why Globule is made the nameserver (NS record) of the generic site only, often www.yourdomain.com. GlobuleStaticResolv does allow you to make Globule the authoritative nameserver of the entire domain by adding records like origin.yourdomain.com. However, you must remember that all stand-alone redirectors must have the same data inserted into their configuration.
GlobuleStaticResolv origin.mydomain.com A 188.8.131.52
GlobuleStaticResolv replica.mydomain.com CNAME myfriend.his-isp.com
This directive is only available in case DNS redirection is compiled in (and into the Apache server).
5.7 Recognized units
For values indicating sizes in bytes you can append the following units:
For values which represent a time period, you can use the following units:
Additionally the string ``never'' (indicating a value of 0) is accepted by some Directives to turn off a certain feature.
Most directives are specified as a number of seconds; therefore any value smaller than that (for instance 1millisecond) is rounded upwards to 1 second.
February 27, 2006