OCLC Openly Informatics

Scholarly Link Specification
Framework
S-Link-S

Scholarly Link Specification Framework (S-Link-S)

Purpose

The Scholarly Link Specification (S-Link-S) Framework is designed to facilitate reference linking to diverse targets. Previously, a database publisher, journal publisher, or library wanting to make links to an electronic resource first had to work out a linking agreement, then work out a method to interchange linking data, and finally have programmers implement the links. S-Link-S streamlines this process by providing a well-defined syntax and vocabulary for the exchange of the necessary information. A single software module can then implement reference linking for a large number of target sites. S-Link-S is widely used to link to publisher and aggregator sites from libraries and link servers.

OCLC Openly Informatics has developed and is using the S-Link-S specification as a basis for 1Cate Linking Engine software and related services to serve the scholarly information industry. This specification was made public from the very beginning, in 1998, in the form of a draft to seek comment and criticism from potential users of the software.

Openly's 1Cate software, which implements S-Link-S, is available for licensing.

The Journalseek database includes a library of over a thousand S-Link-S linking templates, and is also available for licensing.

This version

This public draft was published on July 11, 2006: http://nj.oclc.org/SLinkS/SLinkS-20060711.html. Changes are detailed below.

A previous version was published on September 28, 2005: http://nj.oclc.org/SLinkS/SLinkS-20050928.html

The most recent version of this document can be found at http://nj.oclc.org/SLinkS/SLinkS.html

The S-Link-S template language, as described in its XML document type definition, has reached version 1.15, and can be considered stable enough to serve as a basis for software implementations.

Please note the statement of Copyright and Permitted Use of this document.

Design

The S-Link-S framework has two components:

  1. a URL templating language
  2. a metadata vocabulary.

The URL templating language tells how to construct a URL based on bibliographic data, while the metadata gives information about what the URL leads to and how it may be used. The templating language is expressed using XML syntax conforming to an XML DTD, or Document Type Definition. The metadata is expressed using RDF (Resource Description Framework) syntax using a vocabulary defined in an RDF Schema. At present, the RDF metadata component of S-Link-S is not being actively maintained.

Software based on the S-Link-S specification can take bibliographic data and return a URL, or a POST form based on S-Link-S data.

Before discussing the workings of S-Link-S in detail, it's useful to sketch out the overall architectures of some example systems that could make use of S-Link-S.


The first example is pictured below. Here, it is imagined that S-Link-S specifications are collected and maintained in a centralized clearinghouse (such as Journalseek). To link references, Publisher A sends reference data to a S-Link-S processing engine. The processing engine uses a database of S-Link-S specifications to construct links pertaining to the references, and then returns them to the publisher. The publisher then provides these links to its users.

This model assumes that the links are relatively static, perhaps needing monthly or yearly updating. Since the links are static, the processing engine has time to look up individual articles from databases to fully resolve indirect links. Publishers wanting to specify link methods for their publications only have to deal with the clearinghouse, not all the other publishers.

[Figure: central database schematic]


In the second example, Publisher A has built or licensed a database of S-Link-S specifications. Included in these might be specifications obtained privately from another publisher. This publisher serves his content out of a database, building web pages on the fly, and uses a S-Link-S Engine to dynamically construct links. The dynamic linking allows this publisher to deliver links that expire in a few hours, which his agreement with Publisher C requires.

[Figure: private exchange schematic]

In a similar configuration, a 3rd party, such as a library, may operate a dynamic linking service. This vision for a Scholarly linking environment has largely come to pass with the advent of the OpenURL Link-Servers deployed by libraries. The 1Cate Link-Server is an example of such a use.


In the third example, Publisher B has licensed the full text of a journal to a library. Publisher A includes a generic link to an article in Publisher B's journal. The library's users access the web through a proxy server equipped with a S-Link-S based filter. When User A follows the citation link in publisher A's journal, the S-Link-S filter recognizes it as one that the library has subscribed to. An enhanced HTTP request is then forwarded to publisher B. User A gets full access to the journal without even knowing that her library has intervened on her behalf. Here a generic URL might be http://www.publisher.com/vol1/page35 and "enhanced" URL's might be http://harvard:password@www.publisher.com/vol1/page35 (password added) or http://www.library.edu/www.publisher.com/vol1/page35 (local holding).

[Figure: intranet/library schematic]
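The rewriting step in the third example can be sketched in Python as follows. This is only an illustration under stated assumptions: the pattern for generic links is written by hand from the example URL above, and the function name enhance is hypothetical; a real filter would derive its patterns from S-Link-S templates.

```python
import re

# Hand-written pattern for generic article links at the example publisher.
# Assumption: generic links look like the example URL given in the text.
GENERIC = re.compile(r"^http://www\.publisher\.com/(vol\d+/page\d+)$")

def enhance(url, library_prefix="http://www.library.edu/"):
    """Rewrite a recognized generic URL to a local-holding URL.

    URLs that do not match the pattern are passed through unchanged.
    """
    m = GENERIC.match(url)
    if m is None:
        return url
    return library_prefix + "www.publisher.com/" + m.group(1)

print(enhance("http://www.publisher.com/vol1/page35"))
# prints http://www.library.edu/www.publisher.com/vol1/page35
```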

Link Templating

The basic idea for URL template strings is that most articles can be accessed using a template in which field place-holders are replaced by bibliographic data strings. Some simple manipulations of the resulting string may be required for generation of the final URL.

We represent the place-holders for bibliographic data items using XML general entities, which start with "&" and end with ";". For example, if the volume number is used in the URL, the volume number is denoted by &volume;. Entity names are case sensitive.

Manipulations of strings are denoted using element mark-up of the template string. For example, to pad the volume number with 0's to make 3 characters, you would write <pad padChar="0" length="3">&volume;</pad>

Place-holders for functions of the bibliographic data are denoted using empty elements. These start with "<" and end with "/>". For example, the place-holder for an ISO format string (YYYY-MM-DD) formed from the publication date given by the bibliographic data is <parsedDate/>

An example of a S-Link-S Template

In this example of a S-Link-S template, we model a journal at http://www.publisher.com/ in which articles have URL's based on the volume and start page:

Volume  Start Page  URL
3       25          http://www.publisher.com/003/25/
10      485         http://www.publisher.com/010/485/

<?xml version="1.0"?>
<!DOCTYPE slinks SYSTEM "slinks.dtd">
<slinks ID="example">
	<URL>http://www.publisher.com/<pad padChar="0" length="3">&volume;</pad>/&startPage;/</URL>
</slinks>

XML is a syntax for structural markup of text. A S-Link-S template has a top level element, slinks, which contains all the template information. In our example, "<slinks ID="example">" denotes the beginning of a template element with the identifier "example". Templates need to have an "ID" so that they can be referred to in statements like "Use the template with ID=example to make links to the journal with ISSN=1234-5678". "</slinks>" signifies the end of the S-Link-S template.
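To make the example concrete, here is a minimal Python sketch of how a processor might evaluate this template. It is not the 1Cate implementation. It assumes the bibliographic data is supplied as entity declarations in an internal DTD subset (slinks.dtd itself is not loaded), it handles only the pad element, and it ignores pad's chopping behavior. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The example template, without the external DOCTYPE reference.
TEMPLATE_BODY = (
    '<slinks ID="example">'
    '<URL>http://www.publisher.com/'
    '<pad padChar="0" length="3">&volume;</pad>/&startPage;/</URL>'
    '</slinks>'
)

def entity_declarations(data):
    """Declare one general entity per bibliographic data item."""
    extra = {'"': "&quot;", "%": "&#37;"}  # characters unsafe in an entity literal
    return "".join(
        '<!ENTITY %s "%s">' % (name, escape(value, extra))
        for name, value in data.items()
    )

def apply_element(tag, attrib, content):
    """Apply one manipulation element to its already-rendered content."""
    if tag == "pad":
        return content.rjust(int(attrib["length"]), attrib.get("padChar", "0"))
    raise ValueError("element not handled in this sketch: " + tag)

def render(elem):
    """Render mixed content: leading text, then each child followed by its tail."""
    out = elem.text or ""
    for child in elem:
        out += apply_element(child.tag, child.attrib, render(child))
        out += child.tail or ""
    return out

def build_url(data):
    doc = "<!DOCTYPE slinks [%s]>%s" % (entity_declarations(data), TEMPLATE_BODY)
    return render(ET.fromstring(doc).find("URL"))

print(build_url({"volume": "3", "startPage": "25"}))
# prints http://www.publisher.com/003/25/
```

The XML parser expands the entities before render runs, so the manipulation elements only ever see plain strings.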

If you're just skimming the specification, you should skip ahead to the description of metadata, as the next section starts to describe the S-Link-S Template elements in gory technical detail.

If you're studying the details of the specification, you'll want to look at the DTD (Document Type Definition), which is available at http://nj.oclc.org/SLinkS/slinks.dtd. There is also an HTML version. The next sections go through the DTD, element by element and entity by entity, and explain their purposes.


Top level container elements

Elements are chunks of text marked by tags enclosed in "<" and ">".

<slinks>
This is the top level element of the linking template. It is referred to in metadata using its ID attribute. It can contain "var" elements, "lookUpTable" elements, "scratch" elements, a "DOi" element, a "URL" element, a "postArgs" element, a "cookie" element, a "notRequired" element and "locator" elements, in that order. The order is chosen to enable single-pass parsing.
attributes:
ID
An ID used to refer to the S-Link-S template. (required)
vers
The version of S-Link-S. Should be "1" for the present specification.
complete
A complete template is self-contained and can be used without a network connection; it needs no on-line look-ups or digital signatures. This attribute can be inferred from parsing the file; the attribute is provided to allow applications to store the result for the benefit of subsequent parsing. (yes | no) optional.
resultType
a place to put information about what the template resolves to, e.g. "article", "abstract", "search", "volume", "issue", "homepage". Optional.

A "slinks" element can contain different types of templates. The most common link will be the URL, which is what you type into your web browser to get someplace on the internet. Occasionally, specialized links may require form elements to be entered using the arguments of the "HTTP POST" or "HTTP-Cookie" headers. In the case of POST arguments, the template defines a set of name-value pairs. An emerging technology for linking to digital resources is the "Digital Object Identifier". Templates for DOi's (lower case i is used to indicate that you're talking about the actual DOI string) go in the DOi element.

<URL>
This element contains a template for a URL. In this and other elements declared as mixed content, white space is preserved. URL encoding should be assumed to occur only after all parsing and manipulation has been done.
attributes:
usage
a string describing how the URL is to be interpreted.
possible values:
literal
If a template is literal, the string it forms is to be used as the URL etc.
redirected
A redirected template forms a URI which may get redirected to another URI. The redirection URI is to be used as the "result" of the element. Multiple redirections should be followed to the end.
query
A query URL is to be used only with a locator element to do web-page look-ups.
In principle, we could add this to the DOi element, but that would require adding handle resolving code to the implementation.
<DOi>
This element contains a template for a digital object identifier. When this element is present, it can be used in a URL template using the <getDOi/> element. If the DOi must be retrieved from a database, use a locator element.
<postArgs>
Occasionally, a server may require form data to be submitted in a post arguments header. This element contains one or more <postItem> elements which contain the required form data. Implementing software may choose to provide a user with an HTML form containing the data, or perhaps a javascripted form submission.
attributes:
encoding
The character encoding in which the target expects the form data to be submitted.
Default: "UTF-8"
<postItem>
This element contains a template for a value of a Post argument.
attributes:
key
The name of the POST argument.
isCheck
(optional,(true|false) default: false) whether a Post argument represents a checkbox or not. The reason this is needed is because web browsers treat checkboxes differently from other inputs. The key is not sent when the box is not checked; as a result, many cgi resolvers just check for the presence/absence of the key.
<cookie>
Session ID's and user identification are often handled using a special header in the HTTP protocol called the HTTP-Cookie. The cookie element is included in S-Link-S to enable specification of site authentication in the intranet/library service scenario. The cookie element should never be used in generic or public link specifications.

Example:
<slinks ID="example2">
	<DOi>(the contents of the DOi element)</DOi>
	<URL usage="literal">(the contents of the URL element)</URL>
	<postArgs><postItem key="param1">value1</postItem></postArgs>
</slinks>
<notRequired>
Experience has shown that an important bit of information for using an S-Link-S template is knowledge of which elements are and are not required. Normally, the interpreter can deduce which bibliographic data is required, but occasionally it helps to tell the interpreter explicitly what is required. Any entities required to compute the string templated in the notRequired element are considered to be not required in the URL.
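The postArgs, postItem, and isCheck descriptions above can be illustrated with a short Python sketch that turns evaluated postItem values into a form body. Two points are assumptions rather than part of the specification: an unchecked checkbox is represented here by an empty value, and the function name build_post_body is hypothetical.

```python
from urllib.parse import urlencode

def build_post_body(items, encoding="UTF-8"):
    """Build an application/x-www-form-urlencoded body.

    items: list of (key, value, is_check) tuples, one per postItem,
    with the value already computed from its template.
    """
    pairs = []
    for key, value, is_check in items:
        # Assumption: an empty value means "box not checked". As in a web
        # browser, the key of an unchecked checkbox is not sent at all.
        if is_check and value == "":
            continue
        pairs.append((key, value))
    return urlencode(pairs, encoding=encoding)

print(build_post_body([("param1", "value1", False),
                       ("fulltext", "", True),
                       ("abstract", "on", True)]))
# prints param1=value1&abstract=on
```

The encoding argument plays the role of the postArgs encoding attribute: the same value is percent-encoded differently depending on what the target expects.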

Locators

<locator>
The "locator" element can be used to find a string on a web page retrieved from a URL identified by the query attribute. Its content is a "PERL 5 Regular Expression" which can be used to search for a text pattern on a web page. Note that the characters ">", "<", and "&" must be escaped with entities, "&gt;", "&lt;", and "&amp;" in the content of this element.
attributes:
name
a name of the thing that is being sought (required)
group
if there is more than one parenthesized group, then this selects which group. if group = 0 then the whole match is used. Default: "1"
query
The query attribute identifies an element to be used as the query string for the locator. Default: "URL". If the value is "URL", then the URL template is used to compute a URL which is used to retrieve a web page off the internet. If the value of query is the ID of a "var" element, the content of the var element is used as the retrieval URL. If the value of the query attribute is the name of another locator, then the result of that locator is used as the retrieval URL. By using this attribute in multiple locator elements, a chain of queries can be made. This is useful when the result of a query contains URL's to pages which contain the information being sought.
Example (this locator will find the PubMed ID in a PubMed Search Result):
	<locator name="pmid" group="1" query="URL">PMID: +(\d+)</locator>
Example 2:
<?xml version="1.0"?>
<!DOCTYPE slinks SYSTEM "slinks.dtd">
<slinks ID="publistquery" complete="no">
	<URL usage="query">http://www.publist.com/cgi-bin/search?SearchType=Adv&amp;Title=&amp;ISSN=<replace for="-" with="">&ISSN;</replace>&amp;Desc=&amp;Pub=&amp;MaxHits=10&amp;SortBy=Format&amp;Format=1</URL>
	<locator name="URL2">HREF="/cgi-bin/(show\?PLID=\d+)"</locator>
	<locator name="title" query="URL2">Title:&lt;/B&gt;(.+)&lt;/TD&gt;</locator>
	<locator name="subject" query="URL2">TARGET="SubjectListWin"&gt;(.+)&lt;/A&gt;</locator>
	<locator name="publisher" query="URL2">TARGET="PubWin"&gt;(.+)&lt;/A&gt;</locator>
</slinks>

This example shows how to use chaining to look up data items from the PubList website. The first locator uses the URL element to supply the initial search URL. The web page returned for this URL contains links to other web pages, and these URL's are matched using the regular expression in the locator element's content. The rest of the locators have a query attribute of "URL2", which is the "name" attribute of the first locator, and so the match result of the first query is used as the query URL for these locators. The web page at URL2 is retrieved, and the regular expressions in the locators are used to extract the data items.
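The chaining logic can be sketched in Python as below. This is an illustration, not the reference implementation: Python's re module stands in for PERL 5 regular expressions, page retrieval is delegated to a caller-supplied fetch function (which would also have to resolve relative results such as "show?PLID=42" against the site), and the page contents in the demonstration are invented.

```python
import re

def run_locators(locators, fetch, start_url):
    """Evaluate locators in document order and return their results by name.

    locators: list of dicts with "name" and "pattern", plus optional
              "group" (default 1) and "query" (default "URL").
    fetch:    callable that maps a retrieval URL to the page text.
    """
    results = {"URL": start_url}
    for loc in locators:
        source = results.get(loc.get("query", "URL"))
        match = re.search(loc["pattern"], fetch(source)) if source else None
        # group 0 selects the whole match, as described for the group attribute
        results[loc["name"]] = match.group(int(loc.get("group", 1))) if match else None
    return results

# Invented pages standing in for the network.
PAGES = {
    "search": 'hit: <A HREF="/cgi-bin/show?PLID=42">Journal of Examples</A>',
    "show?PLID=42": "<B>Title:</B>Journal of Examples</TD>",
}
locators = [
    {"name": "URL2", "pattern": r'HREF="/cgi-bin/(show\?PLID=\d+)"'},
    {"name": "title", "query": "URL2", "pattern": r"Title:</B>(.+)</TD>"},
]
found = run_locators(locators, PAGES.__getitem__, "search")
print(found["URL2"], "|", found["title"])
# prints show?PLID=42 | Journal of Examples
```

A locator whose source could not be found yields None, so a broken chain fails quietly instead of fetching a bogus URL.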

Variable Containers

<var>
contains a template for a variable string that other elements may use.
attributes:
ID
The label used to refer to the variable section. Must be unique in the document.
example:
<var ID="a5g16">Volume &volume;</var>
<lookUpTable>
Together with lookUp, this element allows almost any site to be described in the S-Link-S framework. Nonetheless, it's bad practice to use the lookUpTable in place of a search engine if you need a table item for every item in your journal. The item element is the only allowed content.
attributes:
ID
a unique identifier used to address the lookup table (required)
default
The value returned if no match is found default:""
<item>
a record in a lookUpTable. (empty)
attributes:
key
a string used to access the record. If there are two items with the same key, then only the first is used.
value
The string returned by a look-up
<scratch>
is a var by another name. It contains a template for a variable string that other elements may use. Scratch elements can be used as calculation registers so that a template may build up a calculation.
attributes:
ID
The label used to refer to the scratch section. Must be unique in the document.
example:
<scratch ID="volval"><param name="a5g16"/></scratch>

Bibliographic data place-holders

"Bibliographic data place-holders" indicate where bibliographic inputs are to be substituted in a template string. Since XML syntax is used in S-Link-S, XML "SYSTEM" entities are used to represent most of these tokens. For example, if a publisher wants to specify that links to an article on page 587 in volume 3 should use the URL http://www.publisher.com/3/587/, the string "http://www.publisher.com/&volume;/&startPage;/" would be used. Entities always start with "&" and end with ";". The names are case-sensitive.

(Technical Note: A few place-holders are expressed as empty elements (such as <SICI/>). S-Link-S uses entities to represent input strings, and uses elements to represent text strings which are the result of processing input strings. A preferred implementation may be to insert data for entities in "the internal subset" of a template document type declaration.)

General rules for place-holder normalization

Bibliographic data is usually produced by humans, and therefore can acquire "noise". For example a journal may have a page number "L - 123". Authors may transcribe this as "l123", "p.L123", "pp:L123". For this reason, the standard normalizations of the bibliographic input strings are specified to try to remove some of the noise. ("1123" and "123" are also likely, but we can't do much there.)

The normalization operations should take place in an order specified for each entity. In general, the normalizations are irreversible. For example, if the first author's name is R. McDonald, &authLast; is "mcdonald". You should not require "McDonald" in a case-sensitive URL, because linkers may only know that the name is "MCDONALD".

The defined normalizations are:

replaceSlash
Occasionally, a data string may have a "/" in it; most servers treat "/" as a path delimiter. In such cases, "/" should be replaced with "-" in the place-holders which specify this normalization. This is denoted in the following lists as the replaceSlash property.
removeWhiteSpace
White space (as defined in the XML spec) should be removed in data place-holders which specify this normalization.
underscoreWhiteSpace
All instances of one or more white space (as defined in the XML recommendation) should be replaced by a single underscore character in data place-holders which specify this normalization.
removePunctuation
The following punctuation characters should be removed from place-holder strings and replaced with white space in data place-holders which specify this normalization: "#", ",", ".", ":", "(", ")", "[", "]", "{", "}", "!", ";", """.
trimPunctuation
Punctuation (enumerated above) and whitespace are removed from the end and beginning of the place-holder string.
removeStrings
Many entities should have certain strings removed for normalization. The list of strings should be removed in order.
lowerCase
The data string should be converted to lower case.
toAscii
Accented and special characters should be replaced by the closest unaccented equivalents. A unicode to ascii table for use in such conversions is available at http://nj.oclc.org/SLinkS/unicodeMap.txt

An S-Link-S implementation should normalize bibliographic data on entry. In cases where specific formats are required, a S-Link-S implementation should consider malformed or ambiguous entries to be invalid. A S-Link-S implementation may apply correct formatting, such as adding a dash to ISSN's.
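The normalizations defined above are simple string operations, and the following Python sketch shows one possible reading of them. Assumptions: Unicode NFKD decomposition stands in for the unicodeMap.txt table, white space means space, tab, carriage return, and line feed, and the two composite functions apply the normalization orders listed for &volume; and &authLast; in this specification. All function names are hypothetical.

```python
import re
import unicodedata

PUNCTUATION = '#,.:()[]{}!;"'   # the characters enumerated for removePunctuation
XML_SPACE = " \t\r\n"

def replace_slash(s):
    return s.replace("/", "-")

def remove_white_space(s):
    return re.sub(r"[ \t\r\n]+", "", s)

def underscore_white_space(s):
    return re.sub(r"[ \t\r\n]+", "_", s)

def remove_punctuation(s):
    # each punctuation character is replaced with a space
    return "".join(" " if c in PUNCTUATION else c for c in s)

def trim_punctuation(s):
    return s.strip(PUNCTUATION + XML_SPACE)

def remove_strings(s, strings):
    for target in strings:          # removed in the order given
        s = s.replace(target, "")
    return s

def to_ascii(s):
    # Stand-in for unicodeMap.txt: decompose, then drop combining marks.
    decomposed = unicodedata.normalize("NFKD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def normalize_volume(s):
    s = remove_strings(s.lower(), ["volume", "vol"])
    return remove_white_space(trim_punctuation(replace_slash(s)))

def normalize_auth_last(s):
    return underscore_white_space(remove_punctuation(to_ascii(s).lower()))

print(normalize_volume("Vol. 12/13"))        # prints 12-13
print(normalize_auth_last("van der Waals"))  # prints van_der_waals
```

Because "volume" is removed before "vol", the list order matters: reversing it would leave "ume" behind.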

URL Encoding and White Space Handling

In elements declared as mixed content (anywhere you have character data), white space is preserved. URL encoding should be assumed to occur only after all parsing and manipulation has been done.

General Bibliographic data place-holders

&baseURL;
The base URL for the journal. This URL should start with a protocol declaration (usually "http://"). It can be set from S-Link-S metadata using the "baseURL" Property of a WebService. Normalization: removeWhiteSpace
&volume;
A string denoting the journal volume. Normalization: lowerCase, removeStrings {"volume", "vol"}, replaceSlash, trimPunctuation, removeWhiteSpace
&issue;
A string denoting an issue number. Normalization: lowerCase, removeStrings {"issue", "iss", "no", "number", "num"}, replaceSlash, trimPunctuation, removeWhiteSpace
&pages;
A string denoting the page numbers. &pages; = &startPage;-&endPage;
&startPage;
A string denoting the page on which an article starts. Normalization: lowerCase, removeStrings {"pages", "page", "no", "number", "num"}, replaceSlash, trimPunctuation, removeWhiteSpace
&endPage;
A string denoting the page on which an article ends. Normalization: lowerCase, removeStrings {"pages", "page", "no", "number", "num"}, replaceSlash, trimPunctuation, removeWhiteSpace
&pSeq;
When more than one article is found on a page, this string denotes which article on the page is referred to. The &pSeq; string is most commonly a letter. Normalization: lowerCase, replaceSlash, trimPunctuation, removeWhiteSpace
&artNum;
A string denoting an article number in cases where there are no pages. Normalization: lowerCase, removeStrings {"pages", "page", "no" , "number" , "num"}, trimPunctuation, removeWhiteSpace
&itemNumExact;
A string denoting an item number for databases, archives, reports, patents. Normalization is minimal to allow for a wide variety of item number formats. Normalization: removeWhiteSpace
&itemNum;
A string denoting an item number for databases, archives, reports, patents. Normalization is maximal, and should be used for simple item numbers. Normalization: lowerCase, replaceSlash, removeStrings { "number" ,"no." ,"no" , "num.","num", "#"}, removeWhiteSpace
&ISSN;
The ISSN of the journal. If there is a separate ISSN for an on-line version, this place-holder refers to the print version ISSN. The match string for an ISSN is "[0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9x]". The last character is a check digit. It should be set from S-Link-S metadata using the "ISSN" Property
&eISSN;
The ISSN of an on-line version of a print journal. Same format as &ISSN;. It can be set from S-Link-S metadata using the "eISSN" Property
&CODEN;
The CODEN of the journal. The match string for a CODEN string is "[A-Z][A-Z][A-Z][A-Z][A-Z][A-Z0-9]". The last digit is a check digit. It should be set from S-Link-S metadata using the "CODEN" Property
&ISBN;
The International Standard Book Number for a book. The match string for an ISBN is "[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9x]". The last digit is a check digit. Hyphens are omitted.
&ISBN13;
The 13-digit International Standard Book Number for a book. Hyphens are omitted.
&doi;
The digital object identifier for an article, where it is known a priori. This is different from getDOi element (below) which is meant to trigger a lookup from a doi database such as crossref or to use a computed doi from the DOi template element. The practical difference is that the entity triggers a request in the host software, whereas a getDOi triggers a request in the S-Link-S framework. I expect that getDOi will be deprecated.
&jKey;
Many publishers use the same template string for all their journals, and use a key string to distinguish among them. The key string has to be declared in the journal metadata.
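As an illustration of the &ISSN; entry, the Python sketch below formats an ISSN, checks it against the match string, and verifies the check character. The check computation (weights 8 down to 2, modulus 11) is the standard ISSN rule and is not defined by S-Link-S; lower-casing the input and the function names are assumptions made for this sketch.

```python
import re

# Same language as the match string given for &ISSN;.
ISSN_RE = re.compile(r"[0-9]{4}-[0-9]{3}[0-9x]")

def format_issn(raw):
    """Strip white space, lower-case, and add the dash if it is missing."""
    s = re.sub(r"\s+", "", raw).lower()
    if re.fullmatch(r"[0-9]{7}[0-9x]", s):
        s = s[:4] + "-" + s[4:]
    if not ISSN_RE.fullmatch(s):
        raise ValueError("malformed ISSN: %r" % raw)
    return s

def issn_check_char(first_seven):
    """Standard ISSN check character for the first seven digits."""
    total = sum(int(d) * w for d, w in zip(first_seven, range(8, 1, -1)))
    value = (11 - total % 11) % 11
    return "x" if value == 10 else str(value)

def is_valid_issn(raw):
    digits = format_issn(raw).replace("-", "")
    return issn_check_char(digits[:7]) == digits[7]

print(format_issn("00664200"))     # prints 0066-4200
print(is_valid_issn("0066-4200"))  # prints True
print(is_valid_issn("1234-5678"))  # prints False
```

The placeholder ISSN 1234-5678 used earlier in this document fails the check; its correct check digit would be 9.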

Publication Date Placeholders

A S-Link-S implementation should handle dates specially. If the bibliographic token "month" is entered as "January" then a template calling for "mo" should get "01", and "ssn" should get "winter", etc.
&year;
A string denoting the year of publication. The match string is "[0-2][0-9][0-9][0-9]". It's probably safe to assume that most journals published before 1 A. D. are not on-line.
&yr;
A two-digit publication year string. The match string is "[0-9][0-9]". Publishers are admonished not to use this because of Y2K. "00" shall be interpreted to mean "1900".
&month;
A string representing the month of publication. This string should match "(january|february|march|april|may|june|july|august|september|october|november|december)"
&mon;
A 3-letter string representing the month of publication. This string should match "(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)"
&mo;
A 2-digit string representing the month of publication. This string should match "(01|02|03|04|05|06|07|08|09|10|11|12)"
&day;
A 2-digit string representing the day of publication. This string should match "[0-3][0-9]"
<parsedDate/>
An ISO format string (YYYY-MM-DD) formed from &year; or &yr;, &mo; or &month;, and &day;. Note that this is an (empty) element rather than an entity because it is not an input, but rather a string computed from the inputs. Note that the parsedDate element can also be used to format date strings (see below).
&ssn;
A string representing the season of publication. This string should match "(winter|spring|summer|fall)"
&quarter;
A string representing the quarter of publication. This string should match "(1|2|3|4)"
&authLast;
A string representing the first author's last name. Normalization: toAscii, lowerCase, removePunctuation, underscoreWhiteSpace.
&authInit;
A string composed of the first author's first and middle initials. Normalization: toAscii, lowerCase, removePunctuation, removeWhiteSpace.
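The relationships among the date place-holders can be sketched in Python as follows. Two mappings are assumptions, since the specification only states that "January" yields "winter": seasons follow the northern hemisphere with December through February as winter, and quarters are calendar quarters. The function name date_tokens is hypothetical.

```python
MONTHS = ["january", "february", "march", "april", "may", "june",
          "july", "august", "september", "october", "november", "december"]

# Assumption: northern-hemisphere seasons, one entry per month.
SEASONS = ["winter", "winter", "spring", "spring", "spring", "summer",
           "summer", "summer", "fall", "fall", "fall", "winter"]

def date_tokens(year, month_name, day=None):
    """Derive the date place-holder strings from a year, a month name, and an optional day."""
    index = MONTHS.index(month_name.strip().lower())  # ValueError if unknown
    tokens = {
        "year": "%04d" % year,
        "yr": "%02d" % (year % 100),
        "month": MONTHS[index],
        "mon": MONTHS[index][:3],
        "mo": "%02d" % (index + 1),
        "ssn": SEASONS[index],
        "quarter": str(index // 3 + 1),   # assumption: calendar quarters
    }
    if day is not None:
        tokens["day"] = "%02d" % day
        tokens["parsedDate"] = "-".join([tokens["year"], tokens["mo"], tokens["day"]])
    return tokens

t = date_tokens(2006, "January", 5)
print(t["mo"], t["mon"], t["ssn"], t["parsedDate"])
# prints 01 jan winter 2006-01-05
```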

Input-only place-holders

These place-holders should not be used in URL specifications, because their representations are not generally unique. They are often useful, however, for disambiguation or search and are inputs needed for SICI generation.

&uTitle;
title of the item, represented as a Unicode string.
&aTitle;
title of the item, represented as a 7-bit ascii string.
&jTitle;
title of the journal containing the item, represented as a 7-bit ascii string.
&uAuthLast;
First author's last name, represented as a Unicode string.

Public database key place-holders

Certain article databases are sufficiently public, accessible or widely used that their keys may be suitable for use as place-holders in link URLs. In other words, S-Link-S software should know where to look these up.

<pmid/>
The NCBI PubMed unique identifier number for the article. The pmid lookup is accomplished using the pmid.xml template file.
<getDOi/>
The Digital Object identifier for the article should be retrieved and placed here. If there is content in the DOi element, it will be placed here, otherwise, the DOi.xml S-Link-S template will be used to do the look-up. At the present time, there is no global DOi look-up facility. The DOi.xml file currently provided uses the Wiley DOi server to look up Wiley DOi's. For internal use with S-Link-S Calculator, the DOi.xml default template may be altered to accomplish private DOi database lookup.

String manipulation mark-up

S-Link-S templates may make use of these text manipulation functions.

<pad>
used to markup strings which need padding to make a fixed-length string
attributes:
padChar
a character to pad with (default:"0" zero)
length
how long the padded string should be (required) (an integer). If length is shorter than the string to be padded, then the string is chopped. If length is not an integer (more precisely, if Integer(length) throws a NumberFormatException), pad will do nothing.
align
The side the text should align to (left|right) (default:right)
examples:
<pad padChar="0" length="3">2</pad> becomes "002"
Here pad is used to chop:
<pad align="left" length="1">1999</pad> becomes "1"
<replace>
substitute one string for another in the element content
attributes:
for
The string to replace
with
The string to substitute
grep
whether to use PERL5 regular expressions "(yes|no)" default:"no"
example:
<replace for="1" with="one">12</replace> becomes "one2"
<changeCase>
change case of the text in the element content
attributes:
to
(upper|lower|title) the text can be changed to UPPER case, lower case, or Title Case. Title case treats all non-alphanumerics as word separators, and then capitalizes the first letter of each word. (required)
offset
characters up to the offset character are unchanged. default: 0
Example:
<changeCase to="upper">r1260</changeCase> becomes "R1260"
<encode>
URLencode the text in the element content
attributes:
encoding
for non-ascii characters, a target may expect a particular encoding; this attribute allows the template to specify a particular encoding. (Default: "UTF-8")
Example:
<encode>That's all folks!</encode> becomes "That%27s+all+folks%21"
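The four manipulation elements described so far (pad, replace, changeCase, encode) map onto ordinary string functions. The Python sketch below reproduces the examples given for each. Where the text is silent, the choices are assumptions and are marked in the comments: how right-aligned text is chopped, whether title case touches letters after the first, and the use of Python's re module in place of PERL5 regular expressions.

```python
import re
from urllib.parse import quote_plus

def pad(text, length, pad_char="0", align="right"):
    try:
        width = int(length)
    except ValueError:
        return text                      # non-integer length: do nothing
    if width <= 0:
        return ""
    if align == "left":
        return text[:width].ljust(width, pad_char)
    # Assumption: right-aligned text that is too long keeps its rightmost characters.
    return text[-width:].rjust(width, pad_char)

def replace(text, for_, with_, grep="no"):
    if grep == "yes":
        return re.sub(for_, with_, text)  # Python re stands in for PERL5 here
    return text.replace(for_, with_)

def change_case(text, to, offset=0):
    head, tail = text[:offset], text[offset:]   # characters before offset are unchanged
    if to == "upper":
        tail = tail.upper()
    elif to == "lower":
        tail = tail.lower()
    elif to == "title":
        # Non-alphanumerics separate words. Assumption: only the first letter
        # of each word is changed; the remaining letters are left as they are.
        tail = re.sub(r"[^\W_]+", lambda m: m.group(0)[0].upper() + m.group(0)[1:], tail)
    else:
        raise ValueError("unknown case: " + to)
    return head + tail

def encode(text, encoding="UTF-8"):
    return quote_plus(text, encoding=encoding)

print(pad("2", "3"))                       # prints 002
print(pad("1999", "1", align="left"))      # prints 1
print(replace("12", "1", "one"))           # prints one2
print(change_case("r1260", "upper"))       # prints R1260
print(encode("That's all folks!"))         # prints That%27s+all+folks%21
```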
<if>

This element implements conditional sections, and can contain "case", "match", "notEmpty" and "else" elements. case and match contain boolean conditions as attributes. notEmpty is true if it is not empty, and else is always true. The content of the first element with a true condition is selected, and remaining conditional elements need not be evaluated.

In some situations where you think this logic is appropriate, you may want to consider using separate S-Link-S templates and set validity ranges in the metadata using the starts or startDate property.

<case>
The attributes define a comparison of strings. If true, the content of this element becomes the value of the parent if element
attributes:
varID
The ID of a variable section to use as the left hand value for a comparison.
op
a comparison operator, matches "(gt|lt|eq|ne|ge|le)"
const
The right hand value for a comparison.
order
(numeric|alpha|date) default: numeric. In numeric comparison, the numbers are extracted from the strings before comparison. In a date comparison, both strings should be either YYYY-MM-DD format strings or simple token values, e.g. "july"; date string parsing is lenient. In alpha comparisons, the compareTo method of the Java String object is used, with the English Locale.
<else>
like case, but always true
example:
<var ID="v">&volume;</var>
...
<if>
	<case varID="v" op="ge" const="3">V&volume;</case>
	<else>1-2</else>
</if>
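The selection rule for if (the first child with a true condition wins) and the numeric and alpha orders of case can be sketched in Python as below. Assumptions: "extracting the numbers" is read as taking the first run of digits, Python string comparison stands in for Java's compareTo, date order is omitted, and an if with no true branch yields the empty string. The function names are hypothetical.

```python
import operator
import re

OPS = {"gt": operator.gt, "lt": operator.lt, "eq": operator.eq,
       "ne": operator.ne, "ge": operator.ge, "le": operator.le}

def extract_number(s):
    """Assumption: the first run of digits in the string is the number."""
    m = re.search(r"\d+", s)
    if m is None:
        raise ValueError("no number in %r" % s)
    return int(m.group(0))

def case_is_true(left, op, const, order="numeric"):
    if order == "numeric":
        return OPS[op](extract_number(left), extract_number(const))
    if order == "alpha":
        return OPS[op](left, const)   # stand-in for Java String.compareTo
    raise NotImplementedError("date order is not covered by this sketch")

def evaluate_if(branches):
    """branches: (condition, content) pairs in document order; else is (True, content)."""
    for condition, content in branches:
        if condition:
            return content
    return ""  # assumption: no true branch yields an empty string

def volume_segment(volume):
    # Mirrors the example: <case varID="v" op="ge" const="3"> ... <else>
    return evaluate_if([
        (case_is_true(volume, "ge", "3"), "V" + volume),
        (True, "1-2"),
    ])

print(volume_segment("5"), volume_segment("2"), volume_segment("10"))
# prints V5 1-2 V10
```

Volume "10" shows why numeric is the default order: compared as text, "10" sorts before "3".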

<match>
tests a match of one string in another
attributes:
with
The match string
varID
The ID of the "var" or "scratch" element containing the string in which to search for the match string
grep
whether to use PERL5 regular expressions "(yes|no)" default:"no"
example (here we check if the startPage has an "l" or an "L" in it, and change the URL accordingly):
<var ID="p">&startPage;</var>
...
<if>
	<match varID="p" with="[Ll]" grep="yes">letters/&startPage;</match>
	<else>articles/&startPage;</else>
</if>
<notEmpty>
notEmpty is true if its content is not equal to the empty string. This has proven to be useful in cases where you want to use placeholder text when a token is not provided.
example:
<var ID="v">&volume;</var>
...
http://www.site.com/query?issue=<if>
	<notEmpty>&issue;</notEmpty>
	<else>all</else>
</if>
<option>
The contents of the option element may be redundant and can be omitted if it contains a placeholder which was not resolved.
example:
<option>&issue;/</option>

<pattern>

specify a pattern for the element content. Specifying patterns will improve the reliability of link formation and is absolutely essential for the library/intranet scenario where generic link URL's must be recognized and replaced with enhanced link URL's.

attributes:
model
A PERL5 regular expression that the element content should match.
example:
<pattern model="l?[0-9]+">&volume;</pattern>

This example expresses that the &volume; placeholder should be either a number or a number preceded by the letter "l". Note that the pattern element is descriptive, not prescriptive. It has no effect on the enclosed text, except that a processor can use it to flag errors. In this example, if the capital letter "L" was used in stead of "l", the standard normalization (lower-case) for &vol; would contradict the model.
<lookUp>
looks up a value in a lookUpTable using the content of this element as the key. The entire element is replaced by the returned value. Look-up is case sensitive.
attributes:
ref
The ID of the lookUpTable to use
example:
<lookUpTable ID="yrs">
	<item key="1991" value="old/5"/>
	<item key="1992" value="old/6"/>
	<item key="1993" value="old/7"/>
	<item key="1994" value="papers/8"/>
	<item key="1995" value="papers/9"/>
	<item key="1996" value="papers/10"/>
</lookUpTable>
<lookUp ref="yrs">&year;</lookUp> returns "old/7" when &year; is "1993"
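The look-up above amounts to a case-sensitive dictionary access. The following Python sketch loads a shortened copy of the table with the standard library XML parser; the function names and the choice to return an empty string for an unknown key are assumptions, since the text does not say what happens when a key is missing.

```python
import xml.etree.ElementTree as ET

TABLE_XML = """<lookUpTable ID="yrs">
  <item key="1991" value="old/5"/>
  <item key="1993" value="old/7"/>
  <item key="1994" value="papers/8"/>
</lookUpTable>"""

def load_table(xml_text):
    """Build a key -> value dict from a lookUpTable element."""
    root = ET.fromstring(xml_text)
    return {item.get("key"): item.get("value") for item in root.findall("item")}

def look_up(table, key, default=""):
    """Case-sensitive look-up; the default for unknown keys is an assumption."""
    return table.get(key, default)
```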

Functions of Variable Text

The "hash" and "checkSum" elements are functions of a text string which is either in a "var" element referenced by the "varID" attribute, or of what we call the current text. The current text is the text resulting from parsing the character data and elements which occur in the parent element before the relevant hash or checkSum element. It's easier to illustrate by example:
<URL>&baseURL;volume=<pad length="3">&volume;</pad><checkSum/>1234.html</URL>

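Here the "current text" for the checkSum element is everything in the URL element that precedes <checkSum/>, i.e. the expansion of &baseURL;volume=<pad length="3">&volume;</pad>.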
<hash>
an empty element which is replaced by an MD5 hash (expressed in hexadecimal, capital letters) of the targeted marked element
attributes:
varID
The ID of a marked target section
example:
<hash varID="1"/> becomes "29B0FE973D179E0E5B147598137D28CF"
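In Python, an upper-case hexadecimal MD5 string of this kind can be produced with the standard `hashlib` module. The function name is hypothetical, and the UTF-8 encoding of the text is an assumption, since the character encoding is not stated here.

```python
import hashlib

def md5_hex_upper(text, encoding="utf-8"):
    """Return the MD5 hash of text as 32 upper-case hexadecimal characters."""
    return hashlib.md5(text.encode(encoding)).hexdigest().upper()
```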
<checkSum>
an empty element which is replaced by a checksum of the targeted marked element. Supported algorithms are: "mod37", which is useful for alphanumeric strings and is specified in Z39.56-1996 (the version 2 SICI). Other useful checksum algorithms may be added as experience warrants.
attributes:
varID
The ID of a marked target section
type
The name of the algorithm used to calculate the checksum. (mod37).
example:
<var ID="cs1">0066-4200(1990)25&lt;&gt;1.0.TX;2-</var>
<checkSum varID="cs1"/>
becomes "S"
(In this example, note that "<>" must be escaped with "&lt;&gt;")
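The mod37 calculation can be sketched in Python as below. This reflects our reading of the check character rules in Z39.56-1996 and should be verified against the standard before use: digits count 0-9, letters A-Z count 10-35, every other character counts 36, alternate characters are summed starting from the right, and the first of the two sums is weighted by 3. The function name is hypothetical.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ#"

def mod37_check_char(text):
    """Compute a mod37 check character for text (text excludes the check char)."""
    def value(ch):
        index = ALPHABET.find(ch.upper())
        return index if index >= 0 else 36  # punctuation counts as '#'

    values = [value(ch) for ch in reversed(text)]
    odd = sum(values[0::2])   # 1st, 3rd, 5th ... character from the right
    even = sum(values[1::2])  # 2nd, 4th, 6th ... character from the right
    remainder = (3 * odd + even) % 37
    return ALPHABET[(37 - remainder) % 37]
```

Applied to the unescaped string from the example, `mod37_check_char("0066-4200(1990)25<>1.0.TX;2-")` returns "S".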
<parsedDate>
An ISO format string (YYYY-MM-DD) formed from either the publication date (pubDate), the date of processing (today) or the date and time of processing as yyyy-MM-dd:HH:mm:ss (now).
attributes:
when
The date to be formatted. (pubDate | today | now) default: pubDate.
example:
<parsedDate when="today"/>
becomes "2005-08-03"
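The three values of the "when" attribute map naturally onto Python's `datetime` formatting. In this sketch the function name, the `pub_date` and `now` parameters, and the empty-string result for a missing publication date are assumptions.

```python
from datetime import date, datetime

def parsed_date(when="pubDate", pub_date=None, now=None):
    """Format a date according to the 'when' attribute."""
    now = now or datetime.now()
    if when == "pubDate":
        # Assumption: an unknown publication date yields an empty string.
        return pub_date.isoformat() if pub_date else ""
    if when == "today":
        return now.date().isoformat()             # YYYY-MM-DD
    if when == "now":
        return now.strftime("%Y-%m-%d:%H:%M:%S")  # yyyy-MM-dd:HH:mm:ss
    raise ValueError(f"unknown value for 'when': {when!r}")
```

Passing a fixed `now` makes the result reproducible, which is convenient when testing templates.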

Commerce and security place-holders

One advantage of specifying rules for link construction instead of just exchanging tables of links is that you can ask linkers to embed extra information in the links they construct.

&linkerID;
This is an identifier which, by mutual arrangement, can be used to identify the linker. A null string will be substituted when there is no arrangement between linker and linkee. For an example of how this might be used, consider the Amazon.com associates program. The URL in this case would be described as http://www.amazon.com/exec/obidos/ISBN=&ISBN;/&linkerID;/ . Remember that since the resulting URL can be bookmarked, linkerID is only useful in situations where it benefits the linker to add the place-holder; i.e., you can use this to pay people to link to you, but you can't use it to charge them.

Although you might think this might be useful for source tracking, it's really only useful for that if you add a signature as well...

<private/>

This is a place-holder which can be used to implement private arrangements. It is an element to facilitate external function calls by an S-Link-S engine.

As an example of how this might be used, consider a case where, as part of a business arrangement, a publisher wants links to be perishable. To make the links perishable, the publishers agree to set <private/> to today's date as a YYYY-MM-DD string, and then use a template like http://www.publisher.com/&volume;/&startPage;/<private/><dsig/>. The linked publisher then verifies the hash and rejects invalid URLs. This URL cannot be bookmarked.

attribute:
data
whatever extra data is needed for the private arrangement.
example:
<private data="YYYY-MM-DD"/>

In principle, publishers can use the private place-holder to exchange other sorts of information. It is expected that future versions of this specification will include additional specific information exchange place-holders as the uses of these become clear.

<param/>

This is a way to insert arbitrary named parameters into a URL. If the named parameter cannot be found, the S-Link-S interpreter will behave as though an entity of that name was not found, so option elements can be used.

As an example of how this might be used, consider a case where a link server wants to pass through a parameter "foo" supplied on a GET query.

	http://www.aggregator.com/get?foo=value&issn=1213-3423
	                   -->
	    http://www.publisher.com/1213-3423?foo=value

While "issn" is a well-known parameter name, "foo" is not.

attribute:
name
the name of the parameter.
example:
http://www.publisher.com/&ISSN;?foo=<param name="foo"/>

This element was added to assist with parameter passing in OpenURL link servers.
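The pass-through described above could look roughly like this in Python. The helper names `param` and `build_target` are hypothetical, and returning `None` for a missing parameter is our stand-in for "entity not found".

```python
from urllib.parse import parse_qs, urlsplit

def param(name, incoming_url):
    """Return the first value of a named GET parameter, or None if absent."""
    values = parse_qs(urlsplit(incoming_url).query).get(name)
    return values[0] if values else None

def build_target(incoming_url):
    """Rebuild the publisher URL from the example, passing 'foo' through."""
    issn = param("issn", incoming_url)
    foo = param("foo", incoming_url)
    target = f"http://www.publisher.com/{issn}"
    # Behaves like an option element: the query part is dropped if foo is missing.
    return target + (f"?foo={foo}" if foo is not None else "")
```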

<dsig/>
dsig is a 128-bit hex-coded (32-character) integer which is the "MD5" one-way hash of the referenced variable or the current text of the parent element, with the linkee's password appended to it. This can be used to implement "digital signing" of a URL.

An S-Link-S implementation will have a centralized signature authority which authenticates a user and adds the linkee's password to the signed string before computing and returning a hash string to the linker.

attributes:
varID
ID of the variable to sign
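The signing step, and the verification a linkee might perform for the perishable-link arrangement, can be sketched as follows. The function names, the UTF-8 encoding, the lower-case hex output, and the sample password are assumptions; the text only fixes that the password is appended before the MD5 hash is taken.

```python
import hashlib
import hmac

def dsig(text, password):
    """MD5 of text with the linkee's password appended, as 32 hex characters."""
    return hashlib.md5((text + password).encode("utf-8")).hexdigest()

def sign_link(prefix, day, password):
    """Build prefix + date + signature, as in a template ending <private/><dsig/>."""
    text = prefix + day
    return text + dsig(text, password)

def verify_link(url, password, today):
    """Linkee side: accept only links signed today with the shared password."""
    text, signature = url[:-32], url[-32:]
    return text.endswith(today) and hmac.compare_digest(signature, dsig(text, password))
```

A link built with `sign_link("http://www.publisher.com/12/345/", "2005-08-03", "secret")` verifies on 2005-08-03 and fails on any other day or after any change to the signed text.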

There are a number of security and commerce possibilities that have not been included.

  1. Encryption. Encryption has not been included because, in order to make sense, there needs to be someone to hide the information from. Links, by their nature, are rather public and should be storable and exchangeable. If a linkee wants a linker to communicate something privately, putting the data in encrypted links is probably the silliest way to accomplish it.
  2. Immovable Links. Using hashes, it is possible to make links that work only from specific pages. The principal effect of this would be to annoy users, since determined robots could easily spoof the referring page.
  3. Mechanisms to Charge the Linker. Again, if a linkee wants to collect money from people linking to them, it's much easier to do this by private agreement.

SICI-related markup

A SICI element is defined to allow publishers to use SICIs in URLs and DOIs. The titleCode element is for SICI support.

<titleCode/>
an ASCII string derived from the title of the item using the rules specified in ANSI/NISO Z39.56-1991 or ANSI/NISO Z39.56-1996.
attributes:
vers
(1|2) default: 2. the SICI version. vers="1" corresponds to ANSI/NISO Z39.56-1991; vers="2" corresponds to ANSI/NISO Z39.56-1996
example:
when &aTitle; is "Characteristics of InSb Photovoltaic Detectors at 77 K and Below", <titleCode vers="1"/> becomes "CIPD"
<SICI/>
an empty element which is replaced by a SICI string according to the specified attributes.
attributes:
vers
(1|2) default: 2. the SICI version. vers="1" corresponds to ANSI/NISO Z39.56-1991; vers="2" corresponds to ANSI/NISO Z39.56-1996
titleCode
whether to include the Title Code (yes|no) in the contribution segment. default: "no".
enumeration
(v|vn) Default: "v". This attribute specifies the required level of detail in the enumeration string.
chronology
(year|yearMo|yearMoDa|yearQ|yearS) Default:year. This attribute specifies the detail required for the chronology string.
CSI
(1|2|3) Default: "2". The SICI-2 Code Structure Identifier. CSI="1" is the SICI-2 for a journal issue; CSI="2" is the SICI-2 for a contribution; CSI="3" contains private codes.
DPI
(0|1|2|3) Default:"0". The SICI-2 Derivative Part Identifier. DPI="0" is a contribution, DPI="1" is a table of contents, DPI="2" is an index, DPI="3" is an abstract.
MFI
(TX|TL|TH|TS|TB|CD|CF|CT|CO|HE|HD|SC|VX|ZN|ZU|ZZ) Default:"TX". The SICI-2 "Medium/Format Identifier".
example:
when &ISSN; is 00368075, &year; is 1992, &vol; is 256 and &page; is 784, <SICI/> becomes "0036-8075(1992)256&lt;784&gt;2.0.TX;2-Z"

Note that the "&", "<", ">" and '"' characters may need to be escaped. If you use a SICI in a URL, you'll need to deal with this on the receiving end.
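Both kinds of escaping can be handled with Python's standard library, as in the sketch below. The function names are hypothetical, and percent-encoding every reserved character is one conservative choice rather than a requirement of this specification.

```python
from urllib.parse import quote, unquote
from xml.sax.saxutils import escape

def sici_for_url(sici):
    """Percent-encode a SICI so it can be embedded in a URL."""
    return quote(sici, safe="")

def sici_from_url(encoded):
    """Receiving end: recover the original SICI string."""
    return unquote(encoded)

def sici_for_xml(sici):
    """Escape &, < and > for use inside XML character data."""
    return escape(sici)
```

For example, `sici_for_xml("0066-4200(1990)25<>1.0.TX;2-S")` gives the form with &lt;&gt; used in the checkSum example, and `sici_from_url(sici_for_url(s))` returns `s` unchanged.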

Changes

Many of the changes have been made in response to feedback and criticism from members of the publishing community. We are extremely grateful to Mark Doyle, David Ephron, Arthur Smith, Steve Hitchcock, Herbert Van de Sompel and Dan Connolly for comments and criticisms. Miles Poindextexter at Openly has also participated in the refinement of S-Link-S.
10/26/98 First version on the web
10/28/98
10/29/98
10/30/98
11/2/98
11/7/98
11/24/98
12/8/98
2/8/99
2/23/99
Lots of changes made.
3/11/99
5/5/99
S-Link-S Template language thoroughly revised to reflect the implementation in code. The template language is at the "final call" stage and will soon be frozen.
6/2/99
S-Link-S Template language reaches version 1.0. Minor changes were made in order to move the functions of the "Resolution" metadata into the template language.

Changes in metadata:

Changes in template language:
6/22/99
Corrections made in RDF example.
7/14/99
No changes have been made in the template language, but the explanation of locators has been elaborated.

Extensive changes in the metadata schema have been made, partly to reflect the emerging form of the S-Link-S Legal Framework, and partly to weed out properties that were not being implemented.

8/26/99
11/10/99
12/7/2000
3/20/2001
Corrections courtesy of John Punin
7/11/2001
Additions to deal with the real world
5/26/2002
Corrections
5/4/2003
Additions
8/2/2004
Additions
8/2/2005
Overdue modernization
9/28/2005
Additions
7/11/2006
Revisions


Copyright and Permitted Use

Author: Eric S. Hellman, eric@openly.com

Copyright (1998-2006) by OCLC Online Computer Library Center, Inc.

All Rights Reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source must retain the above copyright notice, this list of conditions and the following disclaimer.
  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
  3. The name of the author may not be used to endorse or promote products derived from this software without specific prior written permission.
  4. "S-Link-S" is a Service Mark of OCLC Online Computer Library Center, Inc. The S-Link-S name and logo may not be used to endorse or promote products derived from this specification without specific prior written permission of OCLC Online Computer Library Center, Inc.

THIS INFORMATION IS PROVIDED "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.