
DevOps Journal: Blog Feed Post

Programmability in the Network: Because Big Data Is Often Unstructured

Adaptability of network-hosted services requires programmability because data doesn't always follow the rules

Evans Data recently released its Data & Advanced Analytics Survey 2013 which focuses on "tools, methodologies, and concerns related to efficiently storing, handling, and analyzing large datasets and databases from a wide range of sources." Glancing through the sample pages reveals this nugget on why developers move from traditional databases to more modern systems, like Hadoop:

The initial motivating factors to move away from traditional database solutions were the total size of the data being processed – it being big data – and the data’s complexity or unstructured nature.

It's that second reason (and the data from Evans says it's only second by a partial percentage point) that caught my eye. Because yes, we all know big data is BIG. REALLY BIG. HUGE. Otherwise we'd call it something else. It's the nature of the data, the composition, that's just as important - not only to developers and how that data is represented in a data store, but to the network and how it interacts with that data.

See, the "network" and network-hosted services (firewalls, load balancers, caches, etc.) are generally used to seeing clearly defined (RFC-specified) structured data. Switches and routers are fast because the data they derive decisions from always arrives in the same, fixed schema.

In an increasingly application-driven data center, however, it is the application - and often its data - that drives network decisions. This is particularly true for higher-order network services (L4-7), which act specifically on application data to improve performance and security and, increasingly, to support devops-oriented architectural patterns such as A/B testing, canary deployments and blue/green architectures.
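To make the devops angle concrete, here is a minimal sketch of the kind of canary-routing decision a programmable proxy might execute per request. The pool names, the `X-Canary` opt-in header, and the 10% split are illustrative assumptions, not anything from the article or a specific F5 product.

```python
import random

# Illustrative assumptions: two backend pools and a 10% canary split.
STABLE_POOL = "app-v1"
CANARY_POOL = "app-v2"
CANARY_FRACTION = 0.10

def choose_pool(headers: dict) -> str:
    """Pick a backend pool for a request: testers can opt in to the
    canary explicitly via a (hypothetical) X-Canary header, while a
    fraction of general traffic is sampled into it."""
    if headers.get("X-Canary") == "always":    # explicit opt-in
        return CANARY_POOL
    if random.random() < CANARY_FRACTION:      # sampled canary traffic
        return CANARY_POOL
    return STABLE_POOL
```

The point is not the five lines of logic but that this decision lives in the network path, per request, which is exactly what configuration-only devices cannot express.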

That data is more often than not unstructured. The network is the mechanism by which the unstructured big data referenced by Evans is transferred to the application that ultimately deposits it in a database somewhere. Any data, structured or not, traverses a network of services before it reaches its final destination.

Programmability is required for devops and networking teams to implement the architectures and services necessary to support those applications and systems which exchange unstructured data. Certainly there is value in programmability when applied to structured data, particularly when more complex logic is required to make decisions, but programmability is not strictly required to unlock that value. Capabilities that act on structured (fixed) data can be built into a network-hosted service and exposed as a configurable but well-understood feature.

But when data is truly unstructured and where there is no standard - de facto or otherwise - then programmability in the network is necessary to unlock architectural capabilities. The reason intermediaries can be configured to "extract and act" on data that appears unstructured, like HTTP headers, is because there are well-defined key-value pairs. Consider "Cookie", "Cache-Control" and "X-Forwarded-For" (not officially part of the standard - hence the "X" - but accepted as a de facto industry standard) as good examples. While not fixed, there is a structure to HTTP headers that lends itself well to both programmability and "extract and act" systems.

To interact with non-standard headers, however, or to get at unstructured data in a payload, requires programmability at the level of executable logic rather than simple, configurable options. A variety of devops-related architectures and API proxy capabilities require programmability due to the extreme variability in implementation. There's simply no way for an intermediary or proxy to support, out of the box, such a wide-open set of possibilities, because the very definition of the data is not common. Even though it may be structured in the eyes of the developer, it's still unstructured from the network's perspective, because there is no schema to describe it (think JSON as opposed to XML) and it follows no accepted, published standard.
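Contrast the generic header parser with what payload inspection demands. In this hedged sketch, the field names (`tier`) and pool names are pure assumptions - that is precisely the problem: no published schema tells the proxy where to look, so someone has to write this logic rather than merely configure it.

```python
import json

# Assumed pool names and JSON field names, for illustration only.
def route_by_payload(body: bytes) -> str:
    """Inspect a request body and pick a backend pool. Because the
    payload follows no published standard, the inspection logic must
    be programmed per application, not enabled via a checkbox."""
    try:
        doc = json.loads(body)
    except ValueError:
        return "default-pool"          # not JSON; fall through safely
    # This field name is application-specific - a different app would
    # need entirely different logic here.
    if doc.get("tier") == "premium":
        return "premium-pool"
    return "default-pool"
```

Every application would need its own version of this function, which is why the capability has to be exposed as programmability rather than as a feature.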

The more unstructured data we see traversing the network, the more we're going to need programmability in the network to enable the modern architectures required to support it.

More Stories By Lori MacVittie

Lori MacVittie is responsible for education and evangelism of application services available across F5’s entire product suite. Her role includes authorship of technical materials and participation in a number of community-based forums and industry standards organizations, among other efforts. MacVittie has extensive programming experience as an application architect, as well as network and systems development and administration expertise. Prior to joining F5, MacVittie was an award-winning Senior Technology Editor at Network Computing Magazine, where she conducted product research and evaluation focused on integration with application and network architectures, and authored articles on a variety of topics aimed at IT professionals. Her most recent area of focus included SOA-related products and architectures. She holds a B.S. in Information and Computing Science from the University of Wisconsin at Green Bay, and an M.S. in Computer Science from Nova Southeastern University.