GraphML-Time: Difference between revisions

From visone manual
Jump to navigation Jump to search
Line 24: Line 24:
# are siblings (children of the same element);
# are siblings (children of the same element);
# and their lifetimes include <math>t</math>.
# and their lifetimes include <math>t</math>.
To resolve such conflicts we specify that the value of an attribute for a particular element at a time point <math>t</math> is defined by that <data> element that competes at time <math>t</math> that comes last in the document. The intuition for this convention is that the document is read from the start to the end and whenever a data element is read it defines a value for some element and attribute at the specified lifetime - thereby potentially overriding previous values that have been set for the same element/attribute within this lifetime.
To resolve such conflicts we specify that the value of an attribute for a particular element at a time point <math>t</math> is defined by that <data> element that comes last in the document. The intuition for this convention is that the document is read from the start to the end and whenever a data element is read it defines a value for some element and attribute at the specified lifetime - thereby potentially overriding previous values that have been set for the same element/attribute within this lifetime. Nevertheless, GraphML documents SHOULD avoid to include competing <data> elements.


== Datatypes for time information ==
== Datatypes for time information ==

Revision as of 13:53, 26 August 2013

This page is a working draft for a new GraphML version that allows to store time-dependent graphs and networks in GraphML files.

Note that the specification on this page does not (yet) describe an official GraphML release and is likely to change frequently and without notice.

Goal

The new GraphML version should provide the possibility to store time-dependent graphs and networks in GraphML files. The time-dependence should not be restricted to a series of snapshots of graphs but should allow for evolution in continuous time.

Design decisions

All GraphML elements can be indexed by a lifetime which is a subset of the timeline. The timeline is a linearly ordered set of timepoints, i.e., any two timepoints are either equal or one is less than the other (the smaller point in time happens earlier). Intuitively, the timeline can be thought of as a straight line going from left (past) to right (future). There are two variants of the timeline representing numeric time and calendar time. For numeric time the timeline is identical to the set of decimals. For calendar time the timeline is identical to the timeline of the XML datatype dateTime. There MAY be an affine linear mapping from the numerical timeline to calendar timeline by specifying an offset and a time unit. Numeric time does not need to represent physical time or astronomic time; it may also just be a counter for observation time points, etc.

The interpretation of the lifetime indexing GraphML elements is that the element exists for all time points and it does not exist for all time points . For elements the lifetime means that the particular attribute takes the value specified by for all time points ; it may take another value or it may be unspecified for all time points .

Time information is stored in XML attributes. The names of all time-attributes start with the string time.

Time attributes always restrict the lifetime of elements, that is:

  • If no time attributes (explictitly or implicitly) are associated with an element, then this element has an unrestricted lifetime, i.e., its lifetime is equal to the whole timeline.
  • Lifetimes default to lower elements in the document tree; for instance, the lifetime of a <node> can only be a subset of the lifetime of its parent <graph>.
  • The lifetime of elements having a keyref can only be a subset of the lifetime of the referred elements. For instance, an <edge> cannot live longer than its endpoints (<node>s) and the lifetime of a element can only be a subset of the lifetime of its <key>.

Two elements are said to compete at time if they

  1. refer to the same <key>;
  2. are siblings (children of the same element);
  3. and their lifetimes include .

To resolve such conflicts we specify that the value of an attribute for a particular element at a time point is defined by that element that comes last in the document. The intuition for this convention is that the document is read from the start to the end and whenever a data element is read it defines a value for some element and attribute at the specified lifetime - thereby potentially overriding previous values that have been set for the same element/attribute within this lifetime. Nevertheless, GraphML documents SHOULD avoid to include competing elements.

Datatypes for time information

There are two fundamental data types for time information: data types for time points and data types for durations; these types are identical for numeric time. Since guessing the data types from lexical representation is in general not possible, GraphML documents SHOULD explicitly state which data types they use for time points and durations by using the attributes time.point.type and time.duration.type (see below).

Time point data types

Note that some types for time points also have a duration (such as gYear); some are recurrent (such as time); and some have a duration and are recurrent (such as gMonthDay).

The specification of a simple type for time points is given below.

 <xs:simpleType name="time.point.simpleType" final="#all">
   <xs:union>
     <xs:simpleType>
        <xs:restriction base="xs:short"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:int"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:long"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:integer"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:decimal"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:float"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:double"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:dateTime"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:time"/> <-- recurs (time of the day) -->
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:date"/> <-- has a duration -->
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:gYearMonth"/>  <-- has a duration -->
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:gYear"/>  <-- has a duration -->
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:gMonthDay"/>  <-- recurs and has a duration -->
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:gDay"/> <-- recurs and has a duration -->
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:gMonth"/> <-- recurs and has a duration -->
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:string"/> <-- user-defined representation of date, time, or dateTime-->
     </xs:simpleType>
   </xs:union>
 </xs:simpleType>

We also need the simple data type list of data types for time points; this is achieved as below:

 <xs:simpleType name="time.points.simpleType" final="#all">
   <xs:union>
     <xs:simpleType>
        <xs:list itemType="xs:integer"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:list itemType="xs:decimal"/>
     </xs:simpleType>
     ...
     <xs:simpleType>
        <xs:list itemType="xs:dateTime"/>
     </xs:simpleType>
     ..
     <xs:simpleType>
        <xs:list itemType="xs:string"/> 
     </xs:simpleType>
   </xs:union>
 </xs:simpleType>

Time duration data types

Data types for duration can be numeric, or of the XML data type xs:duration, or user-defined strings.

 <xs:simpleType name="time.duration.simpleType" final="#all">
   <xs:union>
     <xs:simpleType>
        <xs:restriction base="xs:decimal"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:integer"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:long"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:int"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:short"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:float"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:double"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:duration"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:string"/> 
     </xs:simpleType>
   </xs:union>
 </xs:simpleType>

Again we need a type for lists of durations.

 <xs:simpleType name="time.durations.simpleType" final="#all">
   <xs:union>
     <xs:simpleType>
        <xs:list itemType="xs:decimal"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:list itemType="xs:integer"/>
     </xs:simpleType>
     ...
     <xs:simpleType>
        <xs:list itemType="xs:duration"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:list itemType="xs:string"/> 
     </xs:simpleType>
   </xs:union>
 </xs:simpleType>

Time attributes and their interpretation

Lifetimes can be only unions of time intervals where the intervals can have length zero (i.e., are identical to points), can be bounded on both sides, or unbounded to the left or to the right (unbounded on both sides is the default and does not need any time attributes). Without recurrence it is only possible to specify finite unions of intervals. With recurrences it is possible to specify some infinite unions of intervals or to specify some finite unions in a more compact way.

Time attributes can be attached to any GraphML element. Technically, this is achieved by extending the attribute group common.extra.attrib, see below.

 <xs:redefine schemaLocation="http://graphml.graphdrawing.org/xmlns/1.2rc/graphml-structure.xsd">
   <xs:attributeGroup name="common.extra.attrib">
     <xs:attributeGroup ref="common.extra.attrib"/>
     <xs:attributeGroup ref="time.attributes"/>
   </xs:attributeGroup>
 </xs:redefine>

The attribute group time.attributes contains all attributes defined in this section.


Attributes to clarify the representation of lifetimes

Attributes time.point.type and time.duration.type explicitly denote the data types for time points or durations.

 <xs:attribute name="time.point.type" type="time.point.type.simpleType" use="optional"/>
 <xs:attribute name="time.duration.type" type="time.duration.type.simpleType" use="optional"/>

The referred simple types are enumerations of specific strings which are the names of the data types. For instance,

 <xs:simpleType name="time.point.type.simpleType" final="#all">
   <xs:restriction base="xs:string">  
     <xs:enumeration value="decimal"/>
     <xs:enumeration value="integer"/>
     ...
     <xs:enumeration value="dateTime"/>
     ...
     <xs:enumeration value="string"/> 
   </xs:restriction>
 </xs:simpleType>

and similar for durations.

If time points or durations are represented by user-defined strings, the GraphML document should specify a pattern (such as "yyyy-MM-dd'T'HH:mm:ssXXX")

 <xs:attribute name="time.point.pattern" type="xs:string" use="optional"/>
 <xs:attribute name="time.duration.pattern" type="xs:string" use="optional"/>

These attributes default (as everything else) to lower elements in the document tree unless overwritten.

Lifetimes without recurrence

Intervals of finite positive lenght

An interval can be specified in two ways: by a start and end time or by a start time and a duration. Thus, the time attributes can include the following three attributes.

 <xs:attribute name="time.interval.start" type="time.point.simpleType" use="optional"/>
 <xs:attribute name="time.interval.end" type="time.point.simpleType" use="optional"/>
 <xs:attribute name="time.interval.length" type="time.duration.simpleType" use="optional"/>

If no other time attributes are given, these attributes specify as lifetime the interval from the start time to either the end time or to the start time plus the duration (in case of calendar time adding durations to time is specified by the XML Schema specification; in the case of numeric times it is obvious). By default the time point specified as the start time is included in the lifetime and the time point specified by the end time or by the start time plus the duration is excluded. This can be overwritten (see below).

Durations can be negative. In this case the interval is closed to the right and open to the left.

To specify the interval as the lifetime, must be later than . Otherwise (if is earlier than ), the two attributes specify two non-overlapping semi-unbounded intervals (see below).

Durations equal to zero and end times equal to start times are disrecommended since intervals of zero lenght should be specified by the time.point attribute (see below).

Use of time.interval.end and time.interval.length in the same element is disrecommended; parsers might choose which of these elements they choose.

Time points (intervals of lenght zero)

To spefify a time point as the lifetime, use the following attribute

 <xs:attribute name="time.point" type="time.point.simpleType" use="optional"/>

Use of time point together with a time interval (for instance, specified by start and end times) is possible; the lifetime is then the union of a point with an interval.

Note that some data types for time points have a duration; and some are recurrent.

Semi-unbounded intervals

If a start time is not followed by an end time (later), the lifetime is the semi-unbounded interval starting at the start time (inclusive by default) and ending at plus infinity (exclusive).

If an end time is not preceeded by a start time (earlier), the lifetime is the semi-unbounded interval starting at minus infinity (exclusive) and ending at the end time (exclusive by default).

Including / excluding interval borders

Defaults for including or excluding interval borders can be overwritten by the attributes

 <xs:attribute name="time.left.inclusive" type="xs:boolean" use="optional"/>
 <xs:attribute name="time.right.inclusive" type="xs:boolean" use="optional"/>

Infinity or minus infinity is never included.

Unions of intervals or time points

It is possible to store lists of time points and intervals in the following attributes.

 <xs:attribute name="time.points" type="time.points.simpleType" use="optional"/>
 <xs:attribute name="time.intervals.start" type="time.points.simpleType" use="optional"/>
 <xs:attribute name="time.intervals.end" type="time.points.simpleType" use="optional"/>
 <xs:attribute name="time.intervals.length" type="time.durations.simpleType" use="optional"/>

The lifetime of the corresponding element is then the union of these points and/or intervals.

The same comments as when declaring just one interval or point apply. Additionally:

The lists time.intervals.start and time.intervals.end, respectively time.intervals.length, must have the same length and the list items are matched by order.

It is possible to declare additionally to a list of intervals one or two more by use of time.interval.start and/or time.interval.end. This is only recommended for declaring a union of one semi-unbounded interval with other (bounded) intervals.

It is recommended (but not required) that the intervals are non-overlapping and in ascending order.

Recurrent lifetimes

Recurrence allows to specify that lifetimes (defined by the attributes above) repeat indefinitly or a specified number of times or until an end date/time.

Recurrence (general thoughts)

Recurrence can be very simple if the recurrence intervals are just equidistant. For instance, if for numeric times the lifetime is just repeated after a number (representing a duration) or if for calendar time the lifetime is repeated after a fixed given duration, such as every month, every 2 years, every five days and 12 hours, etc. Note that, e.g., "every month" is considered as equidistant event though the interval has not a constant number of days, etc.

Recurrence can be more complicated for calendar times if it is something like "every last Wednesday in any month", or the first day in any month that is from the set of weekdays Monday, Tuesday, Wednesday, Thursday, Friday (which would be something like "the first workday in any month" if there were no holidays). iCalendar can specify such recurrence rules (see the specification of iCalendar here http://tools.ietf.org/html/rfc5545 or an XML version of it here http://tools.ietf.org/html/rfc6321) - but it cannot specify such a simple recurrence as "every Easter-Sunday in any year" (at least I could not find this - but if it were possible then it would still be missing something like "every day during Ramadan in any year" etc).

What we could do (to decide)

We should at least allow equidistant recurrence - possibly with equidistant exceptions. This is very simple to specify. For more sophisticated recurrence patterns we could do any of the following.

  1. Not allow non-equidistant recurrence at all.
  2. Mimick the recurrence rules from iCalendar by specifying corresponding time attributes.
  3. Tell the GraphML users that they should use the XML version of iCalendar (xCal) if they want to specify such recurrence patterns. Since xCal specifies the recurrence by XML elements rather than by attributes these recurrence rules have to be stored in subelements.
  4. Tell the GraphML users that they are free to extend GraphML (this holds anyway) - in particular to use more sophisticated time information in any established or newly proposed format. More sophisticated time information is not restricted to recurrence but can be anything. We have to point this out anyway - no matter how we treat recurrence. (I - Juergen - favor this last option.)

Recurrence for equidistant intervals

Uses (some of) the following attributes.

 <xs:attribute name="time.recur" type="xs:boolean" use="optional"/>
 <xs:attribute name="time.recur.left" type="xs:boolean" use="optional"/>
 <xs:attribute name="time.recur.right" type="xs:boolean" use="optional"/>
 <xs:attribute name="time.recur.interval" type="time.duration.simpleType" use="optional"/>
 <xs:attribute name="time.recur.count" type="xs:positiveInteger" use="optional"/>
 <xs:attribute name="time.recur.until" type="time.point.simpleType" use="optional"/>
 <xs:attribute name="time.recur.since" type="time.point.simpleType" use="optional"/>

The boolean attributes define whether there is recurrence at all: time.recur means recurrence to the left and to the right (past and future); time.recur.left means that recurrence starts in the past and extends to the lifetime that is explicitly defined; time.recur.right from the explicit lifetime into the future.

The duration after which the lifetime is repeated is given in time.recur.interval; this must be specified if there is recurrence at all. If neither count nor since nor until is given then the lifetime repeats indefinitly (to the left, to the right, or to both sides).

The number of repetitions can be bounded by time.recur.count. If the recurrence is to both sides and a count is given, then the lifetime is repeated count-many times starting from the explicit lifetime to the right. The attribute time.recur.since defines the start of the recurrence (inclusively). If since is specified: for left recurrence, the lifetime is repeated no further than to the explicit lifetime and not more often than count many times; for right recurrence, since has no effect; for recurrence to both sides, the lifetime is repeated no further than to the until time and not more often than count many times (indefinitelly if not restrictions are imposed). The attribute time.recur.until marks an end for the recurrence; meaning similar to since. Use of count, since, and until is an overspecification (but can also be treated by taking the lower number of repetitions).

More sophisticated recurrence in calendars

Exception in a recurrence

Similar to recurrence - just defines when the lifetime is not repeated. Use attributes

 <xs:attribute name="time.exrecur.interval" type="time.duration.simpleType" use="optional"/>

etc.

Other

Use an attribute time.stamp (of type time.point.simpleType) to denote the point in time when the element has been added to the graph / set of graphs. (Might be relevant in some situations and is different from the lifetime of the element.) Also maybe time.last.modified.

Time offset and time unit defines an affine invertible mapping from numeric time to a calendar timeline.

 <xs:attribute name="time.offset" type="time.point.simpleType" use="optional"/>
 <xs:attribute name="time.unit" type="time.duration.simpleType" use="optional"/>

Time is explicit if the lifetime of every element can be determined only from the attributes of this element (without having to look at any other element). This may be declared in the GraphML document to help parsers.

 <xs:attribute name="time.explicit" type="xs:boolean" use="optional"/>

Panel-data example

 <?xml version="1.0" encoding="UTF-8"?>
 <graphml xmlns="http://graphml.graphdrawing.org/xmlns/graphml" 
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
          xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns/graphml 
                              http://graphml.graphdrawing.org/xmlns/1.2/graphml.xsd"
          xmlns:xs="http://www.w3.org/2001/XMLSchema"   
          time.point.type="xs:int"
          time.duration.type="xs:int">
   <key attr.description="five-point scale on smoking" attr.name="smokes" attr.type="int" for="node" id="d0"/>
   <graph edgedefault="directed" id="G" time.interval.start="1" time.interval.length="4">
     <node id="n0" time.interval.start="1" time.interval.length="3">
       1
       3
       2
     </node>
     <node id="n1">
       4
       0
     </node>
     <edge id="e0" source="n0" target="n1" time.interval.start="2" time.interval.length="2"/>
   </graph>
 </graphml>

Event-data example

 <?xml version="1.0" encoding="UTF-8"?>
 <graphml xmlns="http://graphml.graphdrawing.org/xmlns/graphml" 
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
          xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns/graphml 
                              http://graphml.graphdrawing.org/xmlns/1.2/graphml.xsd"
          xmlns:xs="http://www.w3.org/2001/XMLSchema"   
          time.point.type="xs:string"
          time.point.pattern="yyyy-MM-dd'T'HH:mm:ssXXX">
   <graph edgedefault="directed" id="G">
     <node id="n0" time.interval.start="2009-01-09T16:47:07+01:00" />
     <node id="n1" time.interval.start="2009-07-23T01:24:51+01:00" />
     <edge id="e0" source="n0" target="n1" time.point="2009-07-23T01:24:51+01:00"/>
     <edge id="e1" source="n1" target="n0" time.point="2009-07-23T01:47:23+01:00"/>
   </graph>
 </graphml>

Another example

 <?xml version="1.0" encoding="UTF-8"?>
 <graphml xmlns="http://graphml.graphdrawing.org/xmlns/graphml" 
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
          xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns/graphml 
                              http://graphml.graphdrawing.org/xmlns/1.2/graphml.xsd"
          xmlns:xs="http://www.w3.org/2001/XMLSchema"   
          time.duration.type="xs:duration"
          time.point.type="xs:string"
          time.point.pattern="yyyy-MM-dd'T'HH:mm:ssXXX">
   <key attr.description="five-point scale on smoking" attr.name="smokes" attr.type="int" for="node" id="d0"/>
   <key attr.description="average frequency of contact per month" attr.name="tie weight" attr.type="double" for="edge" id="d1"/>
   <key attr.description="work in the same company" attr.name="co-worker" attr.type="boolean" for="edge" id="d2"/>
   <graph edgedefault="directed" id="G" >
     <node id="n0" time.interval.start="2009-01-09T16:47:07+01:00" time.interval.length="P2Y7M20DT12H53M22.34S">
       1
       3
     </node>
     <node id="n1">
       0
       4
     </node>
     <edge id="e0" source="n0" target="n1">
       3.1
       false
       true
     </edge>
   </graph>
 </graphml>