Scratch file use in HPC @ORNL, a statistical analysis

Attended SC17 (Supercomputing Conference) this past week and I received a copy of the accompanying research proceedings. There are a number of interesting papers in the research and I came across one, Scientific User Behavior and Data Sharing Trends in a Peta Scale File System by Seung-Hwan Lim, et al from Oak Ridge National Laboratory (ORNL) and the use of files at the Oak Ridge Leadership Computing Facility (OLCF) which was very interesting.

The paper statistically describes the use of a Scratch files in a multi PB file system (Lustre) at OLCF from January 2015 to August 2016. The OLCF supports over 32PB of storage, has a peak aggregate of over 1TB/s and Spider II (current Lustre file system) consists of 288 Lustre Object Storage Servers, all interconnected and connected to all the supercomputing cluster of  servers via an InfiniBand network. Spider II supports all scratch storage requirements for active/queued jobs for the Titan (#4 in Top 500 [super computer clusters worldwide] list) and other clusters at ORNL.

ORNL uses an HPSS (High Performance Storage System) archive for permanent storage but uses the Spider II file system for all scratch files generated and used during supercomputing applications.  ORNL is expecting Spider III (2018-2023) to host 10 billion files.

Scratch files are purged from Spider II after 90 days of no access.The paper is based on metadata analysis captured during scratch purging process for 500 days of access.

The paper displays a number of statistics and metrics on the use of Spider II:

  • Less than 3% of projects have a directory depth >15, the maximum directory depth was recorded at 432, with most projects having a shallow (<10) directory depth.
  • A project typically has 10X the files that a specific researcher has and a median file count/researcher is 2000 files with a median project having 20,000 files.
  • Storage system performance is actively managed by many projects. For instance, 20 out of 35 science domains manually managed their Lustre cluster configuration to improve throughput.
  • File count continues to grow and reached a peak of 1B files during the time being analyzed.
  • On average only 3% of files were accessed readonly, 10% of files updated (read-write) and 76% of files were untouched during a week period. However, median and maximum file age was 138 and 214 days respectively, which means that these scratch files can continue to be accessed over the course of 200+ days.

There was more information in the paper but one item missing is statistics on scratch file size distribution a concern.

Nonetheless, in paints an interesting picture of scratch file use in HPC application/supercluster environments today.


Crowdresearch, crowdsourced academic research

Read an article in Stanford Research, Crowdsourced research gives experience to global participants that discussed an activity in Stanford and other top tier research institutions to try to get global participation in academic research. The process is discussed more fully in a scientific paper (PDF here) by researchers from Stanford, MIT Media Lab, Cornell Tech and UC Santa Cruz.

They chose three projects:

  • A HCI (human computer interaction) project to design, engineer and build a new paid crowd sourcing marketplace (like Amazon’s Mechanical Turk).
  • A visual image recognition project to improve on current visual classification techniques/algorithms.
  • A data science project to design and build the world’s largest wisdom of the crowds experiment.

Why crowdsource academic research?

The intent of crowdsourced research is to provide top tier academic research experience to persons which have no access to top research organizations.

Participating universities obtain more technically diverse researchers, larger research teams, larger research projects, and a geographically dispersed research community.

Collaborators win valuable academic research experience, research community contacts, and potential authorship of research papers as well as potential recommendation letters (for future work or academic placement),

How does crowdresearch work?

It’s almost an open source and agile development applied to academic research. The work week starts with the principal investigator (PI) and research assistants (RAs) going over last week’s milestone deliveries to see which to pursue further next week. The crowdresearch uses a REDDIT like posting and up/down voting to determine which milestone deliverables are most important. The PI and RAs review this prioritized list to select a few to continue to investigate over the next week.

The PI holds an hour long video conference (using Google Hangouts On Air Youtube live stream service). On the conference call all collaborators can view the stream but only a select few are on camera. The PI and the researchers responsible for the important milestone research of the past week discuss their findings and the rest of the collaborators on the team can participate over Slack. The video conference is archived and available  to be watched offline.

At the end of the meeting, the PI identifies next weeks milestones and potentially directly responsible investigators (DRIs) to work on them.

The DRIs and other collaborators choose how to apportion the work for the next week and work commences. Collaboration can be fostered and monitored via Slack and if necessary, more Google live stream meetings.

If collaborators need help understanding some technology, technique, or too, the PI, RAs or DRIs can provide a mini video course on the topic or can point to other information used to get the researchers up to speed. Collaborators can ask questions and receive answers through Slack.

When it’s time to write the paper, they used Google Docs with change tracking to manage the writing process.

The team also maintained a Wiki on the overall project to help new and current members get up to speed on what’s going on. The Wiki would also list the week’s milestones, video archives, project history/information, milestone deliverables, etc.

At the end of the week, researchers and DRIs would supply a mini post to describe their work and link to their milestone deliverables so that everyone could review their results.

Who gets credit for crowdresearch?

Each week, everyone on the project is allocated 100 credits and apportions these credits to other participants the weeks activities. The credits are  used to drive a page-rank credit assignment algorithm to determine an aggregate credit score for each researcher on the project.

Check out the paper linked above for more information on the credit algorithm. They tried to defeat (credit) link rings and other obvious approaches to stealing credit.

At the end of the project, the PI, DRIs and RAs determine a credit clip level for paper authorship. Paper authors are listed in credit order and the remaining, non-author collaborators are listed in an acknowledgements section of the paper.

The PIs can also use the credit level to determine how much of a recommendation letter to provide for researchers

Tools for crowdresearch

The tools needed to collaborate on crowdresearch are cheap and readily available to anyone.

  • Google Docs, Hangouts, Gmail are all freely available, although you may need to purchase more Drive space to host the work on the project.
  • Wiki software is freely available as well from multiple sources including Wikipedia (MediaWiki).
  • Slack is readily available for a low cost, but other open source alternatives exist, if that’s a problem.
  • Github code repository is also readily available for a reasonable cost but  there may be alternatives that use Google Drive storage for the repo.
  • Web hosting is needed to host the online Wiki, media and other assets.

Initial projects were chosen in computer science, so outside of the above tools, they could depend on open source. Other projects will need to consider how much experimental apparatus, how to fund these apparatus purchases, and how a global researchers can best make use of these.

My crowdresearch projects

Some potential commercial crowdresearch projects where we could use aggregate credit score and perhaps other measures of participation to apportion revenue, if any.

  • NVMe storage system using a light weight storage server supporting NVMe over fabric access to hybrid NVMe SSD – capacity disk storage.
  • Proof of Stake (PoS) Ethereum pooling software using Linux servers to create a pool for PoS ETH mining.
  • Bipedal, dual armed, dual handed, five-fingered assisted care robot to supply assistance and care to elders and disabled people throughout the world.

Non-commercial projects, where we would use aggregate credit score to apportion attribution and any potential remuneration.

  • A fully (100%?) mechanical rover able to survive, rove around, perform  scientific analysis, receive/transmit data and possibly, effect repairs from within extreme environments such as the surface of Venus, Jupiter and Chernoble/Fukishima Daiichi reactor cores.
  • Zero propellent interplanetary tug able to rapidly transport rovers, satellites, probes, etc. to any place within the solar system and deploy theme properly.
  • A Venusian manned base habitat including the design, build process and ongoing support for the initial habitat and any expansion over time, such that the habitat can last 25 years.

Any collaborators across the world, interested in collaborating on any of these projects, do let me know, here via comments. Please supply some way to contact you and any skills you’re interested in developing or already have that can help the project(s).

I would be glad to take on PI role for the most popular project(s), if I get sufficient response (no idea what this would be). And  I’d be happy to purchase the Drive, GitHub, Slack and web hosting accounts needed to startup and continue to fruition the most popular project(s). And if there’s any, more domain experienced PIs interested in taking any of these projects do let me know.  


Picture Credit(s): Crowd by Espen Sundve;

Videoblogger Video Conference by Markus Sandy;

Researchers Night 2014 by Department of Computer Science, NTNU;

A steampunk Venusian rover

Read an article last week in theEngineer on “Designing a mechanical rover to explore … Venus“, on a group at JPL, led by Jonathon Sauder who are working on a mechanical rover to study Venus.

Venus has a temperature of ~470c, hot enough to melt lead, which will fry most electronics in seconds. Moreover, the Venusian surface is under a lot of pressure, roughly equivalent to a mile under water or ~160X the air pressure at Earth’s surface (from NASA Venus in depth). Extreme conditions for any rover.

Going mobile

Sauder and his team were brainstorming mechanical rovers, that operated similar to Theo Jansen’s StrandBeest which walks using wind energy alone. (Checkout the video of the BEEST walking).

Jansen had told Sauder’s team that his devices work much better on smooth surfaces and that uneven, beach like surfaces presented problems.

So, Sauder’s team started looking at using something with tracks instead of legs/feet, sort of like a World War 1 tank. That could operate upside down as well as rightside up.

Rather than sails (as the StrandBeest), they plan to use multiple vertical axis wind turbines, called Sarvonius rotors, located inside the tank to create energy and store that energy in springs for future use.

Getting data

They’re not planning to ditch electronics all together but need to minimize the rovers reliance on electronics.

There are some electronics that can operate at 450C based on silicon carbide and gallium carbide which have a very low level of integration at this time, just a 100 transistors per chip.  And they could use this to add electronic processing and control to their mechanical rover.

Solar panels can supply electricity to the high temperature electronics and can operate at 450C.

But to get information off the rover and back to the Earth, they plan to use a highly radio reflective spot on the rover and a mechanical shutter mechanism. The mechanism can be closed and opened and together with an orbiting satellite generating radio pulses and recording the rover’s reflectivity or not, send Morse code from rover to satellite. The orbiting satellite could record this information and then transmit it to Earth.

The rover will make use of simple chemical reactions to measure soil, rock and atmospheric chemistry. Soil and rocks suitable for analysis can be scooped up, drilled out and moved to the analysis chamber(s) via mechanical devices. Wind speed and direction can be sensed with simple mechanical devices.

In order to avoid obstacles wihile roving around the planet, they  plan to use a mechanical probe out othe front (and back?) of the rover with control systems attached to this to avoid obstacles. This way the rover can move around more of the planets surface.

Such a mechanical rover with high temperature electronics might also be suitable for other worlds in the solar system, Mercury for sure but moons of the Jovian planets, also have extreme pressure environments.

And such a electrical-mechanical rover also might work great to probe volcano’s on earth, although the temperatures are 700 to 1200C, ~2 to 3X Venus. Maybe such a rover could be used in highly radioactive environments to record information and send this back to personnel outside the environment or even effect some preprogrammed repairs. Ocean vents could also be another potential place where such a rover might work well.

Possible improvements

Mechanical probes would need to be moved vertically and swing horizontally to be effective and would necessarily have to poke outside the tanks envelope to read obstacles ahead.

Sonar could work better. Sounds or clicks could be produced mechanically and their reflections could be also received mechanically (a mic is just a mechanical transducer). At the pressures on Venus, sound should travel far.

Morse code was designed to efficiently send alpha-numerics and not much else. It would seem that another codec could be designed to send scientific information faster. And if one mechanical spot is good, multiple spots would be better assuming the satellite could detect multiple radio reflective spots located in close proximity to one another on the rover.

Radio works but why not use infrared. If there were some way to read an infrared signal from the probe, it could present more information per pass.

For instance, an infrared photo of the rover’s bottom or top, using with a flat surface, could encode information in cold and hot spots located across that surface.

This could work at whatever infrared resolution available from the satellite orbiting overhead and would send much more information per orbital pass.

In fact, such an infrared surface readout might allow the rover to send B&W pictures up to the satellite. Sonar could provide a mechanism to record a (sound) picture of the environment being scanned. The infrared information could be encoded across the surface via pipes of cool and hot liquids, sort of like core memory of old.

What about steam power. With 450C there ought to be more than enough heat to boil some liquid and have it cool via expansion. Having cool liquid could be used to cool electronics, chemical and solar devices.  And as the high temperatures on Venus seem constant, steam power and liquid cooling would be available all the time and eliminating any need for springs to hold energy.

And the cooling liquid from steam engines could be used to support an infrared signaling mechanism.

Still not sure why we need any electronics. A suitably configured, shrunken, analytical engine could provide the rudimentary information processing necessary to work the shutter or other transmitter mechanisms, initiate, readout and store mechanical/chemical/sonar sensors and control the other items on the rover.

And with a suitably complex analytical engine there might be some way to mechanically program it with various modes using something like punched tape or cards. Such a device could be used to hold and load information for separate programs in minimal space and could also be used to store information for later transmission, supplying a 100% mechanical storage device.

Going 100% mechanical could also lead to a potentially longer lived rover than something using some electronics and mostly mechanical devices on a planet like Venus. Mechanical devices can fail, but their failure modes are normally less catastrophic, well understood. Perhaps with sufficient mechanical redundancy and concern for tribology, such a 100% mechanical rover could last an awful long time, without any maintenance, e.g., like swiss watches.


Photo Credit(s): World War One tank – mark 1 by Photos of the Past

Vintage Philmor morse code practice … by Joe Haupt

Accompanied by an instructor… by vy pham;

Core memory more detail by Kenneth Moore;

Model of the Analytical Engine By Bruno Barral (ByB), CC BY-SA 2.5;

Punched tape by Rositslav Lisovy

Steam locomotives by Jim Phillips

Disk rulz, at least for now

Last week WDC announced their next generation technology for hard drives, MAMR or Microwave Assisted Magnetic Recording. This is in contrast to HAMR, Heat (laser) Assisted Magnetic Recording. Both techniques add energy so that data can be written as smaller bits on a track.

Disk density drivers

Current hard drive technology uses PMR or Perpendicular Magnetic Recording with or without SMR (Shingled Magnetic Recording) and TDMR (Two Dimensional Magnetic Recording), both of which we have discussed before in prior posts.

The problem with PMR-SMR-TDMR is that the max achievable disk density is starting to flat line and approaching the “WriteAbility limit” of the head-media combination.

That is even with TDMR, SMR and PMR heads, the highest density that can be achieved is ~1.1Tb/ The Writeability limit for the current PMR head-media technology is ~1.4Tb/ As a result most disk density increases over the past years has been accomplished by adding platters-heads to hard drives.

MAMR and HAMR both seem able to get disk drives to >4.0Tb/ densities by adding energy to the magnetic recording process, which allows the drive to record more data in the same (grain) area.

There are two factors which drive disk drive density (Tb/ Bits per inch (BPI) and Tracks per inch (TPI). Both SMR and TDMR were techniques to add more TPI.

I believe MAMR and HAMR increase BPI beyond whats available today by writing data on smaller magnetic grain sizes (pitch in chart) and thus more bits in the same area. At 7nm grain sizes or below PMR becomes unstable, but HAMR and MAMR can record on grain sizes of 4.5nm which would equate to >4.5Tb/

HAMR hurdles

It turns out that HAMR as it uses heat to add energy, heats the media drives to much higher temperatures than what’s normal for a disk drive, something like 400C-700C.  Normal operating temperatures for disk drives is  ~50C.  HAMR heat levels will play havoc with drive reliability. The view from WDC is that HAMR has 100X worse reliability than MAMR.

In order to generate that much heat, HAMR needs a laser to expose the area to be written. Of course the laser has to be in the head to be effective. Having to add a laser and optics will increase the cost of the head, increase the steps to manufacture the head, and require new suppliers/sourcing organizations to supply the componentry.

HAMR also requires a different media substrate. Unclear why, but HAMR seems to require a glass substrate, the magnetic media (many layers) is  deposited ontop of the glass substrate. This requires a new media manufacturing line, probably new suppliers and getting glass to disk drive (flatness-bumpiness, rotational integrity, vibrational integrity) specifications will take time.

Probably more than a half dozen more issues with having laser light inside a hard disk drive but suffice it to say that HAMR was going to be a very difficult transition to perform right and continue to provide today’s drive reliability levels.

MAMR merits

MAMR uses microwaves to add energy to the spot being recorded. The microwaves are generated by a Spin Torque Oscilator, (STO), which is a solid state device, compatible with CMOS fabrication techniques. This means that the MAMR head assembly (PMR & STO) can be fabricated on current head lines and within current head mechanisms.

MAMR doesn’t add heat to the recording area, it uses microwaves to add energy. As such, there’s no temperature change in MAMR recording which means the reliability of MAMR disk drives should be about the same as todays disk drives.

MAMR uses todays aluminum substrates. So, current media manufacturing lines and suppliers can be used and media specifications shouldn’t have to change much (?) to support MAMR.

MAMR has just about the same max recording density as HAMR, so there’s no other benefit to going to HAMR, if MAMR works as expected.

WDC’s technology timeline

WDC says they will have sample MAMR drives out next year and production drives out in 2019. They also predict an enterprise 40TB MAMR drive by 2025. They have high confidence in this schedule because MAMR’s compatabilitiy with  current drive media and head manufacturing processes.

WDC discussed their IP position on HAMR and MAMR. They have 400+ issued HAMR patents with another 100+ pending and 75 issued MAMR patents with 46 more pending. Quantity doesn’t necessarily equate to quality, but their current IP position on both MAMR and HAMR looks solid.

WDC believes that by 2020, ~90% of enterprise data will be stored on hard drives. However, this is predicated on achieving a continuing, 10X cost differential between disk drives and (QLC 3D) flash.

What comes after MAMR is subject of much speculation. I’ve written on one alternative which uses liquid Nitrogen temperatures with molecular magnets, I called CAMR (cold assisted magnetic recording) but it’s way to early to tell.

And we have yet to hear from the other big disk drive leader, Seagate. It will be interesting to hear whether they follow WDC’s lead to MAMR, stick with HAMR, or go off in a different direction.



Photo Credit(s): WDC presentation

A tale of two storage companies – NetApp and Vantara (HDS-Insight Grp-Pentaho)

It was the worst of times. The industry changes had been gathering for a decade almost and by this time were starting to hurt.

The cloud was taking over all new business and some of the old. Flash’s performance was making high performance easy and reducing storage requirements commensurately. Software defined was displacing low and midrange storage, which was fine for margins but injurious to revenues.

Both companies had user events in Vegas the last month, NetApp Insight 2017 last week and Hitachi NEXT2017 conference two weeks ago.

As both companies respond to industry trends, they provide an interesting comparison to watch companies in transition.

Company role

  • NetApp’s underlying theme is to change the world with data and they want to change to help companies do this.
  • Vantara’s philosophy is data and processing is ultimately moving into the Internet of things (IoT) and they want to be wherever the data takes them.

Hitachi Vantara is a brand new company that combines Hitachi Data Systems, Hitachi Insight Group and Pentaho (an analytics acquisition) into one organization to go after the IoT market. Pentaho will continue as a separate brand/subsidiary, but HDS and Insight Group cease to exist as separate companies/subsidiaries and are now inside Vantara.

NetApp sees transitions occurring in the way IT conducts business but ultimately, a continuing and ongoing role for IT. NetApp’s ultimate role is as a data service provider to IT.

Customer problem

  • Vantara believes the main customer issue is the need to digitize the business. Because competition is emerging everywhere, the only way for a company to succeed against this interminable onslaught is to digitize everything. That is digitize your manufacturing/service production, sales, marketing, maintenance, any and all customer touch points, across your whole value chain and do it as rapidly as possible. If you don’t your competition will.
  • NetApp sees customers today have three potential concerns: 1) how to modernize current infrastructure; 2) how to take advantage of (hybrid) cloud; and 3) how to build out the next generation data center. Modernization is needed to free capital and expense from traditional IT for use in Hybrid cloud and next generation data centers. Most organizations have all three going on concurrently.

Vantara sees the threat of startups, regional operators and more advanced digitized competitors as existential for today’s companies. The only way to keep your business alive under these onslaughts is to optimize your value delivery. And to do that, you have to digitize every step in that path.

NetApp views the threat to IT as originating from LoB/shadow IT originating applications born and grown in the cloud or other groups creating next gen applications using capabilities outside of IT.

Product direction

  • NetApp is looking mostly towards the cloud. At their conference they announced a new Azure NFS service powered by NetApp. They already had Cloud ONTAP and NPS, both current cloud offerings, a software defined storage in the cloud and a co-lo hardware offering directly attached to public cloud (Azure & AWS), respectively.
  • Vantara is looking towards IoT. At their conference they announced Lumada 2.0, an Industrial IoT (IIoT) product framework using plenty of Hitachi software functionality and intended to bring data and analytics under one software umbrella.

NetApp is following a path laid down years past when they devised the data fabric. Now, they are integrating and implementing data fabric across their whole product line. With the ultimate goal that wherever your data goes, the data fabric will be there to help you with it.

Vantara is broadening their focus, from IT products and solutions to IoT. It’s not so much an abandoning present day IT, as looking forward to the day where present day IT is just one cog in an ever expanding, completely integrated digital entity which the new organization becomes.

They both had other announcements, NetApp announced ONTAP 9.3, Active IQ (AI applied to predictive service) and FlexPod SF ([H]CI with SolidFire storage) and Vantara announced a new IoT turnkey appliance running Lumada and a smart data center (IoT) solution.

Who’s right?

They both are.

Digitization is the future, the sooner organizations realize and embrace this, the better for their long term health. Digitization will happen with or without organizations and when it does, it will result in a significant re-ordering of today’s competitive landscape. IoT is one component of organizational digitization, specifically outside of IT data centers, but using IT resources.

In the mean time, IT must become more effective and efficient. This means it has to modernize to free up resources to support (hybrid) cloud applications and supply the infrastructure needed for next gen applications.

One could argue that Vantara is positioning themselves for the long term and NetApp is positioning themselves for the short term. But that denies the possibility that IT will have a role in digitization. In the end both are correct and both can succeed if they deliver on their promise.



Two paths to better software

Read an article last week in the Atlantic, The coming software apocalypse, about some of the problems in how we develop software today.

Most software development today is editing text files. Some of these text files have 1,000s of lines and are connected to other text files with 1,000s of more lines which are connected to other text files with 1,000s of lines, etc. Pretty soon you have millions of lines of code all interacting with one another.

The problem

Been there done that and it’s not pretty. We even spent some time trying to reduce the code bloat by macro-izing some of it, and that just made it harder to understand, but reduced the lines of code.

The problem is much worse now where . we have software everywhere you look, from the escalator-elevator you take up and down between floors, to the cars you drive around town, to the trains and airplanes you travel between cities.

All of these literally have millions of lines of code controlling them and are many more each year. How can they all possibly be correct.

Well you can test the s&*t out of them. But you can’t cover every path in a lifetime or ten of testing a million line program. And even if you could, changing a single line would generate another 100K or more paths to test. So testing was never a true answer.

Two solutions

The article talks about two approaches that have some merit to solve the real problem.

  • Model based development, a new development and coding environment. In this approach your not so much coding as playing with a model of the behavior your looking for. Say you were coding robot control logic, rather than editing 1000s of lines of Java text, you work with a model of your robot and its environment on 1/2 a screen and on the other half, model parameters (dials, sliders, arrow keys, etc) and logic (sequences) that you  manipulate to do what the robot needs to do. Sort of like Scratch on steroids (see my post on 10 years of Scratch) with the sprite being whatever you need to code for be it a jet engine, automobile, elevator, whatever. The playground would be a realtime/real life simulation of the entity under control of the code and you would code by setting parameters  and defining sequences. But the feedback would be immediate!
  • TLA+ a formal design verification approach. Formal methods have been around since the early 70s. They are used to rigorously specify a design of  some code or a whole system. The idea is that if you can specify a  provably correct design, then the code (derived from that design) has the potential to be more correct. Yes there’s still the translation from code to design that’s error prone but the likelihood is that these errors will be smaller in scope than having a design that wrong.

Model based  development

One can find model based development already in the Apple new application development language, Swift, ANSYS SCADE suite based on Esterel Technologies, and Light Table software development environment.

I have never used any of them but they all look interesting. Esterel was developed for safety critical, real-time aerospace applications. Light Table was a kickstarter project started by a leading engineer of Microsoft’s Visual Studio, the leading IDE. Apple Swift was developed to make it much easier to develop IOS apps.


TLA+ takes a bit getting used to. All formal methods depend on advanced mathematics and sophisticated logic and requires an adequate understanding of these in order to use properly. TLA+ was developed by Leslie Lamport and stands for temporal logic of actions.

TLA+ specifications identify the set of all correct system actions. I would call it a formal pseudo code.

There’s apparently a video course , a hyperbook and a book on the language It’s being used in AWS and Microsoft XBOX and Azure. (See the wikipedia TLA+ article for more information).

There’s PlusCal algorithm (specification) language which is translated into a TLA+ specification which can then be checked by the automated TLC model checker.  There’s also an automated TLAPS, a TLA+ proof system although it doesn’t support all of the TLA+ primitives.  There’s a whole TLA+ toolbox that has these and other tools that can make TLA+ easier to use.


We dabbled in formal specifications methods for on our million+ line storage system at a former employer. It worked well and cleaned up a integrity critical area of the product. Alas, we didn’t expand it’s use to other areas of the product and it sort of fell out of favor. But it worked when and where we applied it.

Of course this was before automated formal methods of today, but even manual methods of specification precision can be helpful to think out what a design has to do to be correct.

I have no doubt that both TLA+ formal methods and model based development approaches and more are required to truly vanquish the coming software apocalypse.

At least until artificial intelligence starts developing all our code for us.


Photo Credits: Six easy pieces of quantitatively analyzing open source, SAP Research;

Spaghetti code still existed,;

How to write apps with Swift, MacWorld;

Modeling the dining philosophers problem in TLA+, Metadata blog


Compressing information through the information bottleneck during deep learning

Read an article in Quanta Magazine (New theory cracks open the black box of deep learning) about a talk (see 18: Information Theory of Deep Learning, YouTube video) done a month or so ago given by Professor Naftali (Tali) Tishby on his theory that all deep learning convolutional neural networks (CNN) exhibit an “information bottleneck” during deep learning. This information bottleneck results in compressing the information present, in for example, an image and only working with the relevant information.

The Professor and his researchers used a simple AI problem (like recognizing a dog) and trained a deep learning CNN to perform this task. At the start of the training process the CNN nodes at the top were all connected to the next layer, and those were all connected to the next layer and so on until you got to the output layer.

Essentially, the researchers found that during the deep learning process, the CNN went from recognizing all features of an image to over time just recognizing (processing?) only the relevant features of an image when successfully trained.

Limits of deep learning CNNs

In his talk the Professor identifies two modes of operations of a deep learning CNN: the encoder layers and decoder layers. The encoder function identifies relevant information in the input and the decoder function takes this relevant information and maps this to an output.

This view results in two statistics that can characterize any deep learning CNN:

  • Sample complexity which refers to the the mutual information inside the last hidden layer of the encoder function, and
  • Accuracy or generalization error, which refers to the mutual information inside the last hidden layer of the decoder function.

Where mutual information is defined as how much of the uncertainty of an input is removed when you have an output that is based on that input. (See the talk for a more formal explanation).

The professor states that any complex deep learning CNN can be characterized by these two statistics where sample complexity determines the number of samples required and accuracy determines the precision by which the deep learning CNN can properly interpret those samples. The deep black line in the chart represents the limits of accuracy achievable at some number of training events, with some number of hidden layers and some sample set.

What happens during deep learning

Moreover, the professor shows an interesting characteristic of all CNNs is that they converge over time in accuracy and that convergence differs based mostly on the number of layers, sample size and training count used.

In the chart, the top row show 3 CNNs with different amounts of training data (5%, 40% and 80% of total). The chart shows the end result and trace of learning within the CNN over the same number of epochs (training cycles). More training data generates more accurate results.

The Professor views those epochs after the farthest right traces (where the trace essentially starts moving up and to the left in the chart), the compression phase of deep learning.

Statistics of deep learning process

The professor goes on to characterize the deep learning  process by calculating the mean and variance of each layers connection weights.

In the chart he shows an standard “eiffel tower” neural network, with 6 hidden layers, each with less neurons (nodes)  than the previous layer (12 nodes, 10 nodes, 7 nodes, etc.). And what he plots is the average weights and variance between layers (red lines are average and variance of the weights for arcs[connections] between nodes in layer 1 to nodes in layer 2, blue lines the mean and variance of weights for arcs between layer 2 and 3, purple lines the mean and variance of weights for arcs between layer 3 and 4, etc.).

He shows that at the start of training the (randomly assigned) weights for each layer have a normalized mean which is higher than its normalized variance. He calls this phase as high signal to noise (I would say the opposite, its low signal to noise, more noise than signal). But as training proceeds (over more epochs), there comes a point where the layer mean drops below its variance and the signal to noise ratio changes dramatically. After that point the mean weights and variance of the group of layers start to diverge or move apart.

The phase (epochs) after the line where the weights means are lower than its variance, he calls the Compression phase of the deep layer CNN training.

The Professor suggests that every complex deep learning CNN looks the same during training if you perform the calculations. The professor shows charts like this for other deep learning CNNs used on different problems and they all exhibit some point where their means are lower than their weights after which means and variances between layers starts to differentiate.

Do layer counts and sample size matter?

It turns out that the more hidden layers you have, the sooner (less training) you need to begin the compression phase. This chart shows the same problem, with different hidden layer counts. One can see in the traces, that not only is accuracy improved with more layers but it also more quickly reaches the compression phase.

Using his sample complexity and accuracy statistics, the Professor has also shown that their are limits to the amount of accuracy to any deep learning CNN based on the function of layer counts, sample size and training event counts.


As far as I know, The Professor and his team are the first to try to characterize and understand what happens during deep learning. In doing so, he has shown that the number of layers and the number of samples can be used to predict the speed of learning. And ultimately how accurate any deep learning CNN can be.


Hyperloop One in Colorado?

Read a couple of articles last week (TechCrunch, ArsTechnica & Denver Post) about Colorado becoming a winner in the Hyperloop One Global Challenge. The Colorado Department of Transportation (DoT) have joined with Hyperloop One to commission a study on Hyperloop transportation across the front range, from Cheyenne, WY to Pueblo, CO.

There’s been talk forever about adding a passenger train in Colorado from Fort Collins to Pueblo but every time they look at it they can’t make the economics work. How’s this different?

Transportation and the Queen city of the Prairie

Transportation has always been important to Denver. It was the Denver Pacific railroad from Denver to Cheyenne that first linked Denver to the rest of the nation. But even before that there was a stage coach line (Leavenworth & Pike’s Peak Express) that went through Denver to reduce travel time. Denver is currently the largest city within 500 miles and the second only to Phoenix as the most populus city in the mountain west.

Denver International Airport is a major hub and the world’s sixth busiest airport. Denver is a cross road for major north-south and east-west highways through the mountain west. Both the BNSF and Union Pacific railroads serve Denver and Denver is one of the major stops on the Amtrak  passenger train from San Francisco to Chicago.

Why Hyperloop?

Hyperloop can provide much faster travel, even faster than airplanes. Hyperloop can go up to 760 mph (1200 km/h) and should average 600 mph (970 km/h) from point to point

Further, it could potentially require less security.  Hyperloop can go above or below ground. But in either case a terrorist act shouldn’t be as harmful as one on a plane thats traveling at 20 to 30,000 feet in the air.

And because it can go above or below ground it could potentially make use current transportation right of way corridors for building its tubes. Although to go west, it’s going to need a new tunnel or two through the mountains.

Stops along the way

The proposed hyperloop track will bring it through Greeley and as far west as Vail. For a total of 360 miles. Cheyenne to Pueblo have about 10 urban centers between and west of them (Cheyenne, Fort Collins, Greely, Longmont-Boulder, Denver, Denver Tech Center [DTC], West [Denver] metro, Silverthorne/Dillon, Vail, Colorado Springs and Pueblo).

Cheyenne to Pueblo is is 213 miles apart and ~3.5 hr drive with Denver at about the 1/2 way point. With Hyperloop, Denver to either location should take ~10 minutes without stops and the total trip, Cheyenne to Pueblo should be ~21 minutes.

Yes but is there any demand

I would think the way to get a handle on any potential market is to examine airline traffic between these cities. Airplanes can travel at close to these speeds and the costs are public.

But today there’s not much airline traffic between Cheyenne, Denver and Pueblo.  Flights to Vail are mostly seasonal. I could only find one flight from Denver to Cheyenne over a week, one flight between Cheyenne and Pueblo, and 16 flights between Denver and Pueblo. The airplanes used on these trips only holds 9 passengers, so maybe that would amount to a maximum of 162 air travelers a week.

The other approach to estimating potential passengers is to use highway traffic between these destinations. Yes the interstate (I25) from Cheyenne through Denver to Pueblo is constantly busy and needs another lane or two in each direction to handle peak travel. And travel to Vail is very busy during weekends. But how many of these people would be willing to forego a car and travel by Hyperloop?

I travel on tollroads to get to the Denver Airport and it’s a lot faster then traveling non-tollroad highways. But the cost for me is a business expense and it’s not that frequent. These days there’s not much traffic on my tollroad corridor and at rush hour, there’s very few times where one has to slow down. But there are plenty of people coming to the airport each day from the NorthWest and SouthEast Denver suburbs that could use these tollroads but don’t.

And what can you do in Pueblo, Cheyenne or Denver for that matter without a car. It depends on where you end up. The current stops in Denver include the Denver International Airport, DTC, or West Metro (Golden?). Denver, Golden, Boulder, Vail, Greeley and Fort Collins all have compact downtowns with decent transportation. But for the rest of the stops along the way, you will probably want access to a car to get anywhere. There’s always Uber and Left and worst case renting a car.

So maybe Hyperloop would compete for all air travel and some portion of the car travel between along the Cheyenne to Denver to Pueblo. It just may not be large enough.

Other alternative routes

Why stop at Cheyenne, what about Jackson WY or Billings MT? And why Pueblo what about Sante Fe and Albuquerque in NM. And you could conceivably go down to Brownsville, TX and extend up to Calgary and Edmonton in Alberta, Canada, if it made sense. I suppose it’s a question of how many people for what distance.

I would think that going east-west would be more profitable. Say Kansas City to Salt Lake City with Denver in between. With this corridor: 1) the distances are longer (Kansas to Salt Lake is 910 mi [~1465 km]); 2) the metropolitan areas are much larger; and 3) the air travel between them is more popular.

There are currently 10 winners for Hyperloop One’s Global Challenge Contest.  The other routes in the USA include Texas (Dallas, Houston & San Antonio), Florida (Miami to Orlando), & the midwest (Chicago IL to Columbus OH to Pittsburgh PA). But there are others in Canada and Mexico in North America and more in Europe and India.

Hyperloop One will “commit meaningful business and engineering resources and work closely with each of the winning teams/routes to determine their commercial viability.” All this means that each of the winners will be examined professionally to see if it makes economic sense.

Of the 10 winners, Colorado’s route has the least population, almost by a factor of 2. Not sure why we are even in contention, but maybe it’s the ease of building the tubes that makes us a good candidate.

In any case, the public-private partnership has begun to work on the feasibility study.


Photo Credit(s): 7 hyperloop facts Elon Musk would love us to know, Detechter

Take a ride on Hyperloop…, Daily Mail