Friday, June 20, 2014

JSR 363 Unit of Measurement API - a First Glance

During the last Hackergarten in Zurich I had the chance to look at the current GitHub repository of JSR 363, focusing on the API only. This JSR is quite young, but there is already a code base available, which was originally forked from https://code.google.com/p/unitsofmeasure/ . My objective was relatively simple: look at the main abstractions and get a gut feeling for them. So here it is:

Main Concepts

One main artifact is the Quantity interface, which is defined as follows:

public interface Quantity<Q extends Quantity<Q>> 
extends Measurement<Q, Number> {
  Quantity<?> divide(Quantity<?> that);
}
The corresponding JavaDoc describes this interface as follows: "Represents a quantitative properties or attributes of thing. Mass, time, distance, heat, and angular separation are among the familiar examples of quantitative properties."

As shown above, this interface extends Measurement:

public interface Measurement<Q extends Quantity<Q>, V>
extends UnitSupplier<Q>,
        ValueSupplier<V>,
        ConversionOperator<Measurement<Q, V>, Unit<Q>> {

  Measurement<Q, V> add(Measurement<Q, V> that);
  Measurement<Q, V> substract(Measurement<Q, V> that);
  Measurement<?, V> multiply(Measurement<?, V> that); 
  Measurement<Q, V> multiply(V that); 
  Measurement<Q, V> divide(V that);
  Measurement<Q, V> inverse(); 
}
Whereas suppliers are a common concept also known from other APIs, ConversionOperator and Unit must be explained:
// equivalent to @FunctionalInterface
public interface ConversionOperator<T, U> {
  T to(U unit);
}
public interface Unit<Q extends Quantity<Q>>
extends UnitTransformer<Q>, Nameable {
  String getSymbol(); 
  Dimension getDimension();
  Unit<Q> getSystemUnit(); 
  Map<? extends Unit<?>, Integer> getProductUnits();
  boolean isCompatible(Unit<?> that);
  <T extends Quantity<T>> Unit<T> asType(Class<T> type)
    throws ClassCastException;
  UnitConverter getConverterTo(Unit<Q> that)    
    throws UnconvertibleException;
  UnitConverter getConverterToAny(Unit<?> that)
    throws IncommensurableException, UnconvertibleException;
  Unit<Q> alternate(String symbol);
  Unit<Q> shift(double offset);
  Unit<Q> multiply(double factor);
  Unit<?> multiply(Unit<?> that);
  Unit<Q> divide(double divisor);
  Unit<?> divide(Unit<?> that);
  Unit<?> inverse();
  Unit<?> root(int n);
  Unit<?> pow(int n); 
}
 
But we still have not covered everything: there is also a Dimension, modelling the algorithmic relations between units:

public interface Dimension {
  Dimension multiply(Dimension that);
  Dimension divide(Dimension that);
  Dimension pow(int n);
  Dimension root(int n);
  Map<? extends Dimension, Integer> getProductDimensions();
}
Summarizing we have the following main artifacts involved:
  • A Quantity, which describes the flavor or type of measurement, e.g. Length, Weight or Volume.
  • A Unit, which defines a concrete unit of a specific measurement, e.g. Nanometer, Tons, Gallons.
  • A concrete Measurement, which can be seen as a concrete value measured in a Unit, e.g. 10 Meters, 3 Tons or 4 Gallons. Measurements also provide simple arithmetic for addition, subtraction, division and multiplication with other measurements of compatible units.
  • Dimension defines the numeric relationships between a set of related units.
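To make these roles more tangible, here is a minimal sketch of how a quantity, a unit and a measurement could relate to each other. It is not the JSR 363 API: all type and field names (ConceptsSketch, factorToBase, METRE, KILOMETRE) are assumptions made up for illustration, values are plain doubles, and Dimension is left out entirely.

```java
import java.util.Objects;

/** Hypothetical, simplified model of the roles above; not the JSR 363 interfaces. */
final class ConceptsSketch {

    /** The kind of thing that is measured (the "Quantity" role). */
    interface Quantity {}
    interface Length extends Quantity {}

    /** A unit of a quantity, described by its factor relative to an assumed base unit. */
    static final class Unit<Q extends Quantity> {
        final String symbol;
        final double factorToBase;

        Unit(String symbol, double factorToBase) {
            this.symbol = Objects.requireNonNull(symbol);
            this.factorToBase = factorToBase;
        }
    }

    /** A numeric value paired with a unit (the "Measurement" role). */
    static final class Measurement<Q extends Quantity> {
        final double value;
        final Unit<Q> unit;

        Measurement(double value, Unit<Q> unit) {
            this.value = value;
            this.unit = Objects.requireNonNull(unit);
        }

        /** Converts this measurement into another unit of the same quantity. */
        Measurement<Q> to(Unit<Q> target) {
            return new Measurement<>(value * unit.factorToBase / target.factorToBase, target);
        }

        /** Adds a measurement of the same quantity; the result keeps this unit. */
        Measurement<Q> add(Measurement<Q> that) {
            return new Measurement<>(value + that.to(unit).value, unit);
        }

        @Override
        public String toString() {
            return value + " " + unit.symbol;
        }
    }

    static final Unit<Length> METRE = new Unit<>("m", 1.0);
    static final Unit<Length> KILOMETRE = new Unit<>("km", 1000.0);

    public static void main(String[] args) {
        Measurement<Length> a = new Measurement<>(1.5, KILOMETRE);
        Measurement<Length> b = new Measurement<>(500, METRE);
        System.out.println(a.add(b));           // prints "2.0 km"
        System.out.println(a.add(b).to(METRE)); // prints "2000.0 m"
    }
}
```

The type parameter Q is what keeps a length from being added to a mass at compile time; the conversion factor is the only piece of unit metadata this sketch needs.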

My Feedback

Defining an API for all available measures and units is a tough thing. A lot of work has already been done, and I am looking forward to seeing how this JSR will evolve. Nevertheless my first impression is that the API mixes up a few concerns that I think should be separated:
  • I would expect to have such general and defining concepts like quantities and units being separated from more concrete aspects like measurements and dimensions. 
  • Another element that was present in the former measurement API was the SystemOfUnits concept, or some kind of accessor that acts as an entry point. I think the API requires some entry point, e.g. in the form of a singleton, to make it complete. Basically a programmer should be able to program against the API without having to know any details of the implementations.
  • I would like the quantity to be modeled as the more general but basically independent concept and let Unit refer to its corresponding Quantity. This would resolve the circular dependency between Quantity and Unit.
  • Furthermore I would really ensure that the Quantity and Unit types are always evaluable at runtime (be aware of type erasure). 
  • For me it is also arguable if we really have enough benefits of Nameable. If the idea was to provide functional support for getName(), maybe it makes sense, but I would try to reduce the number of artifacts needed to the minimum. Just start small and if you really have to add something add it later. 
  • I do not like the idea of adding arithmetic operations on units, since I consider units as well as quantities to be on the metadata level, similar to currencies in the money world. I would expect them to be on measurements, which for me would be the same concept as a monetary amount.
  • Finally I think the naming of system unit and product units is confusing. I would prefer something like a main or leading unit, and derived units (or somebody can tell me that the semantics are different here; I am not sure I fully understood these concepts...).
So I would come up with something like:
  public interface Quantity {
   String getName();
   boolean isCompatible(Unit<?> that); // needed here ?
  }
  public interface Unit<Q extends Quantity>{
   Class<Q> getQuantity();
   Unit<Q> getLeadingUnit();
   Collection<Unit<Q>> getDerivedUnits();
   boolean isCompatible(Unit<Q> that);
  }
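The point about type erasure can be illustrated with a small sketch: if a unit carries the Class of its quantity, as getQuantity() suggests, the quantity can still be inspected and checked at runtime even when the generic parameter is unknown. The class TypedUnit and its asType check are assumptions for illustration only, not part of any existing API.

```java
import java.util.Objects;

/** Hypothetical sketch: a unit that keeps its quantity type available at runtime. */
final class ErasureSketch {

    interface Quantity {}
    interface Length extends Quantity {}
    interface Mass extends Quantity {}

    static final class TypedUnit<Q extends Quantity> {
        private final Class<Q> quantityType;
        private final String symbol;

        TypedUnit(Class<Q> quantityType, String symbol) {
            this.quantityType = Objects.requireNonNull(quantityType);
            this.symbol = Objects.requireNonNull(symbol);
        }

        /** Survives erasure because the Class object is stored explicitly. */
        Class<Q> getQuantity() {
            return quantityType;
        }

        String getSymbol() {
            return symbol;
        }

        /** Narrows a unit of unknown quantity, failing immediately on a mismatch. */
        @SuppressWarnings("unchecked")
        <T extends Quantity> TypedUnit<T> asType(Class<T> type) {
            if (!quantityType.equals(type)) {
                throw new ClassCastException(
                    symbol + " measures " + quantityType.getSimpleName()
                        + ", not " + type.getSimpleName());
            }
            return (TypedUnit<T>) this;
        }
    }

    public static void main(String[] args) {
        TypedUnit<?> unknown = new TypedUnit<>(Length.class, "m");
        System.out.println(unknown.getQuantity().getSimpleName()); // prints "Length"

        TypedUnit<Length> metre = unknown.asType(Length.class);
        System.out.println(metre.getSymbol()); // prints "m"

        try {
            unknown.asType(Mass.class);
        } catch (ClassCastException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Without the stored Class, a cast to the wrong quantity would go unnoticed at the call site and would only surface later, if at all.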
The management of quantities and units, including the mappings between units and quantities, I would handle as a separate concern, e.g. with something like:

 // Signatures only; method bodies are omitted in this sketch.
 public final class Quantities {
   public static Collection<Quantity> getQuantities();
   public static Collection<String> getQuantityNames();
   public static Quantity getQuantity(String name);
   public static <Q extends Quantity> Q getQuantity(
                                            Class<Q> type);
   public static boolean isCompatible(Quantity quantity,
                                      Unit<?> that);
 }
 public final class Units{
   public static <Q extends Quantity> 
      Collection<Unit<Q>> getUnits(Q quantity);
   public static <Q extends Quantity>
      Collection<Unit<Q>> getLeadingUnits(Q quantity);
   public static <Q extends Quantity> 
      Collection<Unit<Q>> getDerivedUnits(Unit<Q> unit);
   public static boolean isCompatible(Quantity quantity,
                                      Unit<?> that);
   public static <Q extends Quantity>
      boolean isDerived(Unit<Q> mainUnit, Unit<Q> derivedUnit);
   public static <Q extends Quantity>
      boolean isLeadingUnit(Unit<Q> unit);
   public static SystemOfUnits getSystemOfUnits(String name);
   public static SystemOfUnits getSystemOfUnits(Locale country);
   public static Collection<SystemOfUnits> getSystemOfUnits(
                                              Unit<?> unit);
   public static Collection<String> getSystemOfUnitNames();
   public static SystemOfUnits getDefaultSystemOfUnits();
 }
public final class Dimensions{ ... }
Compared to what we have above, this is much simpler and more comprehensible. There are further advantages as well:
  • All these singletons can be backed by SPIs. With this it should be easily possible to support partial implementations and overrides. E.g. when an additional unit type is defined, e.g. for measuring dark matter in super-light speed mode (e.g. "WARPS"), it would be possible to implement only the parts required to support this additional unit.
  • Such a structure could also allow additional functionality such as contextual behaviour in a multi-tenancy/EE context.
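A singleton backed by an SPI could look roughly like the following sketch. It uses the standard java.util.ServiceLoader; the SPI interface QuantitiesSpi, the fallback DefaultQuantitiesSpi and the sample quantity names are assumptions for illustration and do not exist in JSR 363.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.ServiceLoader;

/** Hypothetical sketch of a static accessor that delegates to a pluggable SPI. */
final class SpiSketch {

    /** The SPI an implementation would provide (assumed name). */
    interface QuantitiesSpi {
        Collection<String> getQuantityNames();
    }

    /** Fallback used when no provider is registered. */
    static final class DefaultQuantitiesSpi implements QuantitiesSpi {
        @Override
        public Collection<String> getQuantityNames() {
            return Collections.unmodifiableList(Arrays.asList("Length", "Mass", "Time"));
        }
    }

    /** The entry point users program against; it hides how the backend is found. */
    static final class Quantities {
        private static final QuantitiesSpi SPI = loadSpi();

        private Quantities() {
        }

        private static QuantitiesSpi loadSpi() {
            // Picks the first provider registered under META-INF/services, if any.
            Iterator<QuantitiesSpi> providers =
                ServiceLoader.load(QuantitiesSpi.class).iterator();
            return providers.hasNext() ? providers.next() : new DefaultQuantitiesSpi();
        }

        static Collection<String> getQuantityNames() {
            return SPI.getQuantityNames();
        }
    }

    public static void main(String[] args) {
        // With no provider registered, the fallback answers.
        System.out.println(Quantities.getQuantityNames()); // prints "[Length, Mass, Time]"
    }
}
```

An override would then only need to ship its own QuantitiesSpi implementation plus the matching service registration file; client code calling Quantities stays unchanged.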
Given an artifact SystemOfUnits it probably would be possible to support accessing units by their literal symbols, e.g.
 public interface SystemOfUnits{
   <Q extends Quantity> 
     Collection<Unit<Q>> getLeadingUnits(Quantity quantity);
   <Q extends Quantity> Unit<Q> getUnit(String symbol);
   ...
 }
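Looking up units by symbol could be as simple as a map inside the unit system. The following sketch assumes a non-generic Unit with just a symbol and a quantity name; the class names, the "SI" label and the registered units are illustrative assumptions only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a unit system that resolves units by their symbol. */
final class SymbolLookupSketch {

    static final class Unit {
        final String symbol;
        final String quantityName;

        Unit(String symbol, String quantityName) {
            this.symbol = symbol;
            this.quantityName = quantityName;
        }
    }

    static final class SystemOfUnits {
        private final String name;
        private final Map<String, Unit> unitsBySymbol = new LinkedHashMap<>();

        SystemOfUnits(String name, Unit... units) {
            this.name = name;
            for (Unit unit : units) {
                unitsBySymbol.put(unit.symbol, unit);
            }
        }

        /** Returns the unit registered under the symbol, or fails with a clear message. */
        Unit getUnit(String symbol) {
            Unit unit = unitsBySymbol.get(symbol);
            if (unit == null) {
                throw new IllegalArgumentException(
                    "Unknown unit symbol in " + name + ": " + symbol);
            }
            return unit;
        }
    }

    public static void main(String[] args) {
        SystemOfUnits si = new SystemOfUnits("SI",
            new Unit("m", "Length"), new Unit("kg", "Mass"));
        System.out.println(si.getUnit("kg").quantityName); // prints "Mass"
    }
}
```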
 
Now let us focus again on the next artifact: Measurement. Here
  • I would reduce the arithmetic methods to the absolute minimum. Instead of having operations on it like shift, alternate I would prefer an extension mechanism, similar to MonetaryQuery or MonetaryOperator in JSR 354.
  • I would heavily recommend to separate the concern of conversion. Basically this might also be doable based on the extension mechanism above, or it can be modeled as a separate API, but also provided by a separate accessor singleton.
  • Finally I would like to see whether we really need value types other than decimal numbers to represent the values of measurements. Even if there are rare cases where measures are not modeled as decimal numbers, I would check whether it is nevertheless possible to map them to decimal numbers in a feasible way. If, as I hope, most measurements can be modeled in a unified way, I would take this as the leading use case. For special cases you can still have an additional complex measurement model. This would IMO make the API simpler for the majority of use cases, which I think is better than trying to include edge cases that pollute the API in the end (making it more difficult to use).
So summarizing I would propose something like the following:

public interface Measurement<Q extends Quantity, 
                             U extends Unit<Q>>
extends UnitSupplier<U>, ValueSupplier<Number>{
  Measurement<Q, U> add(Measurement<Q, ?> that);
  Measurement<Q, U> subtract(Measurement<Q, ?> that);
  Measurement<Q, U> multiply(Measurement<Q, ?> that);
  Measurement<Q, U> multiply(double that);
  Measurement<Q, U> multiply(long that);
  Measurement<Q, U> multiply(Number that);
  Measurement<Q, U> divide(double that);
  Measurement<Q, U> divide(long that);
  Measurement<Q, U> divide(Number that);
  Measurement<Q, U> inverse();
  Measurement<Q, U> pow(int n);
  Measurement<Q, U> with(MeasurementOperator<Q, U> op);
  <R> R query(MeasurementQuery<Q, U, R> query);
}
where
  public interface MeasurementOperator<Q extends Quantity, 
                                       U extends Unit<Q>>{
      <T extends Measurement<Q, U>> T apply(T measurement);
  }
  public interface MeasurementQuery<Q extends Quantity,
                                       U extends Unit<Q>, R>{
      R query(U unit);
  }
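To show what such an extension mechanism buys, here is a deliberately simplified sketch. It drops the generic parameters, lets the query receive the measurement itself (as MonetaryQuery does with amounts in JSR 354), and uses Java 8 lambdas; the names roundTo, IS_NEGATIVE and queryFrom are assumptions for illustration.

```java
/** Hypothetical sketch of operators and queries as extension points on a measurement. */
final class ExtensionSketch {

    /** Transforms a measurement into another measurement. */
    interface MeasurementOperator {
        Measurement apply(Measurement measurement);
    }

    /** Extracts an arbitrary result from a measurement. */
    interface MeasurementQuery<R> {
        R queryFrom(Measurement measurement);
    }

    static final class Measurement {
        final double value;
        final String unitSymbol;

        Measurement(double value, String unitSymbol) {
            this.value = value;
            this.unitSymbol = unitSymbol;
        }

        Measurement with(MeasurementOperator op) {
            return op.apply(this);
        }

        <R> R query(MeasurementQuery<R> query) {
            return query.queryFrom(this);
        }
    }

    /** Rounding lives outside the core type instead of being one more method on it. */
    static MeasurementOperator roundTo(int decimals) {
        double scale = Math.pow(10, decimals);
        return m -> new Measurement(Math.round(m.value * scale) / scale, m.unitSymbol);
    }

    static final MeasurementQuery<Boolean> IS_NEGATIVE = m -> m.value < 0;

    public static void main(String[] args) {
        Measurement raw = new Measurement(3.14159, "m");
        Measurement rounded = raw.with(roundTo(2));
        System.out.println(rounded.value + " " + rounded.unitSymbol); // prints "3.14 m"
        System.out.println(rounded.query(IS_NEGATIVE));               // prints "false"
    }
}
```

The core Measurement type only needs with and query; rounding, shifting, conversion or formatting helpers can then be added by anyone without touching the interface.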
I would also definitely recommend handling formatting and conversion as separate concerns, but I will stop here for now. I think the JSR's Expert Group has enough input for discussion, and I hope they will pick up at least some of my concerns, so we get a simple, comprehensible, easy to use, but nevertheless powerful measurement API.
Of course, as always, any comments are welcome!

11 comments:

  1. Thanks for the input. During the meeting with JSR 363 Co Spec Lead Jean-Marie Dautelle he confirmed the intent to keep the API free from concrete classes, so suggested factories or facade classes like Units, etc. would be up to implementations, but we see almost no place for them in the API.
    Where such factories like "Dimensions" are considered we clearly prefer a (OSGi friendly) Service concept, so instead of a final concrete class "Dimensions" the "service" package of the API (if you want the SPI, that name is something we may use if seen more appropriate) already declares a DimensionService and similar ones. The Java ServiceLoader mechanism is compatible with OSGi (http://blog.osgi.org/2013/02/javautilserviceloader-in-osgi.html) and on devices that support it in ME 8 the same ServiceLoader can also be used on ME, but like other parts of the API this is optional, so it's up to implementing platforms whether they need it or not.

    Arithmetic operations on Unit or Dimension allow multiples and submultiples; things like KILO(GRAM) or NANO(METRE) are operations not on a measurement or quantity, but on the unit. Other languages occasionally take this view too (e.g. Sun's Fortress language used the notion of "Dimension and Units"), but every design and expression of the matter, as well as prior APIs, has a Quantity contain a Unit (see http://en.wikipedia.org/wiki/Physical_quantity#Units_and_dimensions), not the other way round as suggested above.

    Aside from that Martin Fowler was probably the one next to Andrew Kennedy (creator of Unit API for F# based on his academic thesis and papers) who analysed and described the Quantity Pattern some time ago: http://martinfowler.com/eaaDev/quantity.html

    The relation between Quantity and Measurement (a term, some APIs call Measure, others, especially OSGi Measurement have the very same type) can be best explained by Wikipedia:

    >A physical quantity (or "physical magnitude") is a physical property of a phenomenon, body, or substance, that can be quantified by measurement.
    Guess "Magnitude" would be among the few other terms applicable here. Leading research institutions like CERN reviewed the approach prior to JSR proposal and pointed out, the generic type (with Number being a common, but not exclusive special case) is something they are happy to see match what they're doing in this area, too. Allowing them to consider adopting such a standard more easily.

    Functional Interfaces like Nameable, UnitSupplier, etc. aim to help Java 8 Lambda usage, but unlike JSR 354, baking Lambdas or Java SE 8 types into the API is strictly forbidden, as it would break Java ME 8 compatibility. The API is portable and has to be. The RI will be based on Java ME 8, which also makes it compatible with prior Java SE versions, at least SE 7. The "forward-port" of an implementation to Java SE 8 with Lambdas and other new features exists, but it is not the RI, given the IoT focus: one requirement for this JSR is Embedded first. If a lambda usage of some functional interfaces is not met, they could be merged into the main interface in some cases.

    We'll discuss the suggestions in more detail, maybe in an EG call, but in some cases, structural design was inspired by widely respected experts from Andrew Kennedy to Martin Fowler, and not just made up from scratch or "reinvented" by the EG, so in such cases, we had a good reason to listen to their opinion already;-)

    Cheers,
    Werner

  2. Hi Werner
    My key point is that I see basically the following main levels of abstraction:
    * The Quantity, which defines the "thing" measured, such as length, weight or electrical power. (Also note that this view of 'Quantity' does not match the view of Martin Fowler; his is much nearer to Measurement.)
    * Units (including sub- and super-units) that are able to express a quantity. It is a separate concern to manage the different mappings between different units of the same quantity, e.g. nanometer, kilometer, miles and the distance light travels in one year. Similarly it would be much simpler to define a comprehensive meta-data API, since this API can then primarily focus on returning this kind of meta-information and not be mixed up with a concrete unit.
    Of course, I can add all this type of information on the Unit as well, but as I said, it makes Unit much more complicated to understand and implement. For me this is an obvious trade-off, depending on whether these concerns are separated or not. Semantically the API as a whole should be able to cover similar aspects.
    * Measurements (Measures) that combine a concrete unit with an effective value (mostly a number). That it should be possible to combine different measures (of the same quantity) makes sense. Martin Fowler also takes that up for addition, subtraction, multiplication and division. He takes into account the variants where a division is also allowed with a non-scalar, resulting basically in a new quantity. I agree on that as well, since this is the level most users are dealing with, and it is comfortable if conversion is done implicitly, if possible. But again the information about what conversion should be done and how IMO should not be part of the measurement interface itself, but reside elsewhere.
    * Metadata Information which defines relationships between the stuff above.

    So I do not basically think on different abstractions as others do. But I like concerns to be separated (especially on meta level), because this makes things clearer.

    Also I think a SE/ME API should be complete. A user should be able to program against a JSR (this also will help you implementing a good TCK). So you will end up with just a few accessor classes, but thinking users will use the ServiceLoader IMO is not an option (nevertheless you must define the services to be accessed ;-) ). Just thinking an API must not contain classes makes no sense, keeping them minimal I am with you (btw exceptions also are classes ;-( ). So as I said take it positively and keep me on track, when I should have a look at the API again before EDR...

    Cheers,
    Anatole

  3. It will be very difficult to assess the real world usage of this API without a skeleton of implementation classes, so it would be good to see some of that scaffolding put in place so that folks can see how easy (or not) it is to perform common tasks with Units & Measurements.

    I'd love to see more of the scientific community who need to work with these concepts day to day, weigh in on this proposed JSR. I for one am definitely not an expert, so I'll leave my feedback as is :-).

  4. Experts in the EG (especially fellow Spec Leads) expressed their aim to keep the API interface only. And allow concrete service implementations to be done in a platform specific way. One implementation may go for OSGi while another one especially for ME like the RI needs to use the Service Loader mechanism available there. Most JSRs make this clear separation between API and RI. JMS 2.0 where we met the Spec Lead just a few weeks ago in London is one of the clearest examples beside e.g. CDI. It contains 2 concrete classes that are not exceptions. A number somewhat similar to the current 363 API as there are one or two classes like Range that are "scaffolding" if you want, or the TimeSeries class. Especially that is on the other hand worth discussing if it could go into implementations. On the API level the time stamp for that can't exceed what is available in Java ME, thus either long or types in java.util, that are ME compliant. If somebody wanted to use other Date/Time APIs of the 3 or 4 in SE 8, then that would only be acceptable in an SE implementation. Aside from JMS a JSR with quite a lot of implementations is JSF.

    Replies
    1. We considered placing concrete unit systems like SI into the API in JSR-275, with the result that many EC Members found this irritating and it was among reasons why they voted against it then. The fact that many of them are no longer in the EC doesn't mean they were wrong or their concerns won't matter today. Even abstract base classes like AbstractQuantity or AbstractUnit, etc. differ slightly when you take the difference between SE and ME. The RI is based on Java ME, allowing the greatest possible number of devices to use it. All implementing frameworks and projects from JScience to Eclipse UOMo or GeoAPI intend to use the new JSR, plus the 2 RI and if you want "OpenJDK port" that makes 5 or more implementations. Not all JSRs have more than 1 or 2 implementations, even for Unit-API 0.6 if you count the Opower fork of JScience 5 there are 6 there, too, one closed source at BT, all others are open source. So the scientific community and related fields like "Smart Grid" already use it now. And even the old EC "disregarded" JSR 275 is used by projects like Parfait, Humanizer (a framework to make data including that by Joda more "Human Readable and Understandable;-) and embedded within the current GeoAPI standard used by Apache SIS or Eclipse projects in LocationTech like uDig.

      Especially all those downstream projects and (OGC) standards and their implementations mean a responsibility to provide new features or concepts where it makes sense, but not entirely break the code of all these users forcing them to completely rewrite their apps;-) Some in need of BigDecimal values may prefer an SE implementation (there should be at least 3 of them for different environments from Real Time enabled JScience/Javolution to Eclipse and the SE 8 one close to OpenJDK) while users of JSR 256 or things like OSGi Measurement (http://www.osgi.org/javadoc/r4v42/org/osgi/util/measurement/Measurement.html) will be fine with the RI. Their old solution was only using double primitives after all;-)

      The Measurement class of OSGi Measurement is among the inspirations why Measurement contains operations. If based on the fact the value can be any useful type, often non-numeric, we find arithmetic operations better one level deeper at Quantity, that is something to consider. It would limit these methods to Quantity, but if you define Measurement you probably won't have that much use for traditional math operations there. Unlike Fowler's more "academic" examples, OSGi Measurement has plenty of operations, most importantly things like Measurement.mul(Measurement) which for let's say (1 METRE).multiply(1 METRE) makes perfect sense, it'll return 1 m², that's something Money or Time probably won't have, but other quantities do.
      Between next month's Developer Week and JavaZone we expect to go EDR. Allowing a closer look at the API and RI plus where available also initial TCK tests.

      Cheers,
      Werner

    2. Last but not least, ICU4J 52.x and above (currently 53.1) contains all aspects of Unit-API except conversion or arithmetics (which Mark said they just don't do in a format/parse/UI framework)
      The MeasureFormat JavaDoc provides nice examples:
      http://icu-project.org/apiref/icu4j/com/ibm/icu/text/MeasureFormat.html

      ICU4J follows the constructor pattern, so we rarely see a getInstance() there for elements like format or managers, but no of() or valueOf().
      Aside from that it is very similar, closest to UOMo which is based on an earlier version of ICU4J and follows some of its behavior.

      The latest version introduces more interfaces, e.g. a MeasureUnit.Factory. And remarkably similar to the UnitTransformer interface a more generic version by Mark here: http://icu-project.org/apiref/icu4j/com/ibm/icu/text/Transform.html

      Werner

    3. If you only want to have interfaces in your API even for value types, fair enough. But I still would appreciate to have one small singleton, which acts as an entry point for the different services. If you implement it using OSGI, CDI, ServiceLoader, Spring or your private full blown multi capable allround component container is fully transparent. That would require one single class and one SPI, loadable through service loader. That still would be ME compatible, but would make your API complete, so code written in ME should be basically compatible with ME without creating implementation dependencies. Think on that, I think it is worth the price ;-)

  5. Within RHQ (http://jboss.org/rhq) we also use measurements with units and have helpers that allow us to "rightscale" measurement values to e.g. display units. Take the example of 12,413,462,345 bytes, which is rarely usable for human consumption. Here a display value of "12.2 GB" would be right. Similarly, the other way around, "0.001s" could be "rightsized" as "1ms".
    I think it would be good if JSR 363 would also contain such mechanisms.

  6. An interesting point (also the RHQ project, I believe I looked at it in a DevOps project, thus some of it could be of interest to Anatole's Config JSR;-) we're happy to look into. Does it do this approximation while formatting or with the actual data? Twitter does something similar when showing number of Tweets, and (that's something we'd also have to consider) started doing so for multiple languages now, too. CLDR/Unicode could be useful, as mentioned in the JSR proposal. Besides, there is at least one other Java monitoring tool which Red Hat uses and AFAIK contributes to (at least the Performance CoPilot Umbrella), Parfait: https://code.google.com/p/parfait/
    It uses JSR-275 and they would be interested to migrate to 363 at some point.

  7. Does RHQ also have a JIRA or similar issue tracker? I found a bug in the unit definitions but can't see the project in issues.jboss.org.
