ClearSky Data Attains Advanced Technology Partner Status In The Amazon Web Services Partner Network

ClearSky Data has announced that it has achieved Advanced Technology Partner status in the Amazon Web Services (AWS) Partner Network (APN). AWS' global partner program focuses on providing APN Partners with go-to-market and technical support. ClearSky Data's expanded relationship with AWS furthers its mission to help more companies get to the cloud quickly, so they can eliminate secondary infrastructure, take advantage of hybrid disaster recovery (DR), and dramatically reduce costs with consumption-based enterprise storage.

As enterprise IT teams struggle to consolidate multiple infrastructure silos and free themselves from painful upgrade cycles, the case for cloud migration is escalating. At the same time, these organizations are tasked with multiple data management challenges, from backing up archival information assets to rapidly ingesting massive amounts of machine data at the edge. ClearSky Data provides a storage-as-a-service solution integrated with AWS, enabling enterprises to access all of this data wherever it’s needed — on-premises or in the cloud — without replication.

“Our partnership with AWS extends our ability to help enterprises seamlessly get to the cloud and get out of their secondary data centers,” said Ellen Rubin, CEO and co-founder of ClearSky Data. “As an Advanced Technology Partner in the APN, ClearSky Data will continue to innovate at the place where the edge meets the cloud, so enterprises can more easily and efficiently manage their data today and into the future.”

“Our hybrid cloud storage clients want the economics of the cloud, but with the same level of security and performance as on-premises workloads,” said Arthur Olshansky, CEO of Federal Hill Solutions and founder of Molnii Cloud, an MSP that delivers hybrid IT services and solutions. “ClearSky Data’s support of AWS technology helps us offer high-performance data access, primary storage, offsite backup and disaster recovery, reducing the costs and complexity seen with traditional storage.”

Source: CloudStrategyMag

IDG Contributor Network: Dataops: agile infrastructure for data-driven organizations

About a decade ago, the software engineering industry reinvented itself with the development and codification of so-called devops practices. Devops, a compound of “development” and “operations,” refers to a set of core practices and processes that aim to decrease time to market by thoughtfully orchestrating the tight integration between software developers and IT operations, emphasizing reuse, monitoring, and automation. In the years since its introduction, devops has taken the enterprise software community by storm, garnering respect and near-religious reverence from practitioners and devotees.

Today, at the dawn of 2018, we are seeing a subtle but profound shift that warrants a reexamination of established software development practices. In particular, there is a growing emphasis on leveraging data for digital transformation and the creation of disruptive business models, concomitant with the growth of data science and machine learning practices in the enterprise. As adoption of big data computing platforms and commodity storage becomes more widespread, leveraging large data sets for enterprise applications is becoming economically feasible. We're seeing massive growth in investment in data science applications—including deep learning, machine learning, and artificial intelligence—that involve large volumes of raw training data. The insights and efficiencies gained through data science are among the most disruptive forces in enterprise applications.

However, the goals and challenges of building a robust and productive data science practice in the enterprise are distinct from the challenges of building traditional, lightweight applications that do not rely on large volumes of persistent data. These new challenges have motivated the need to go beyond devops to a more data-centric approach to building and deploying data-intensive applications, one that includes a holistic data strategy. While the principal goals of devops—namely agility, efficiency, and automation—remain as important today as ever, the requirements of leveraging massive volumes of persistent data for new applications have spawned a set of practices that extend devops in important ways to support data-intensive applications—hence dataops.

Dataops in the enterprise is a cross-functional process that requires the close collaboration of multiple groups to build, deploy, secure, and monitor data-intensive applications. A dataops process brings together teams from Development (to build the application logic and architecture), Operations (to deploy and monitor applications), Security & Governance (to define the data access policies for both production and historical data sets), Data Science (to build data science and machine learning models that become part of larger applications), and Data Engineering (to prepare training data sets for the data science team).

Consider the development process for a prototypical data-intensive application today. First, data-intensive applications often embed data science or machine learning functions as part of the application logic. Data scientists build these models through an iterative training process that typically relies on large volumes of training data.

Once the models have been trained, they can be deployed or embedded into a larger application that a software developer would implement. This paradigm of leveraging data to build the application logic itself is a major shift; before the data science and machine learning renaissance of the past few years, application logic was designed wholly by the developer without needing to run large experiments and therefore without relying on large volumes of persistent data.

After the application is deployed to a production environment, the embedded data science or machine learning models can be rescored and therefore improved over time. As a result, the data science model might be redeployed independent of any other changes to the overall application logic. Whereas devops practices promote agility by allowing application logic to be continuously deployed to reflect new features or fixes, data-intensive applications extend this philosophy by also emphasizing continuous model deployment, so that newly trained or rescored data science models can be pushed to existing production applications. Underlying this whole process, of course, is the need to ensure that the data used to train and rescore the models, as well as the production data streams, are governed and secured properly.
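
As a concrete illustration of continuous model deployment, here is a minimal sketch that retrains a model on fresh data and republishes the serialized artifact the production application loads, independently of any application-code release. scikit-learn and joblib are stand-ins here, and the file paths and column names are hypothetical.

```python
# A minimal, hedged sketch of continuous model deployment: retrain on
# fresh data, serialize the model, and publish the artifact that the
# production application loads. scikit-learn and joblib are stand-ins;
# the file paths and column names are hypothetical.
import joblib
import pandas as pd
from sklearn.linear_model import LogisticRegression

def retrain_and_publish(training_csv: str, model_path: str) -> None:
    data = pd.read_csv(training_csv)
    features = data.drop(columns=["label"])
    model = LogisticRegression(max_iter=1000)
    model.fit(features, data["label"])
    # Replacing this artifact redeploys the model without touching any
    # other application logic.
    joblib.dump(model, model_path)

# Serving side, deployed separately from the application code:
#   model = joblib.load(model_path)
#   predictions = model.predict(live_features)
```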

Dataops focuses on the business value of data science and machine learning by improving the time to market for intelligent, data-intensive applications. While still emerging as an enterprise practice, dataops is increasingly driving teams to collaborate and organize in new ways to build, manage, deploy and monitor data-intensive applications. Fundamentally, dataops puts data squarely at the heart of application development considerations and turns conventional application-centric thinking on its head.

This article is published as part of the IDG Contributor Network.

Source: InfoWorld Big Data

IDG Contributor Network: AI: the challenge of data

In the last few years, AI has made breathtaking strides driven by developments in machine learning, such as deep learning. Deep learning is part of the broader field of machine learning, which is concerned with giving computers the ability to learn without being explicitly programmed. Deep learning has had some incredible successes.

Arguably, the modern era of deep learning can be traced back to the ImageNet challenge in 2012. ImageNet is a database of millions of images categorized using nouns such as “strawberry,” “lemon,” and “dog.” In that year's challenge, a convolutional neural network (CNN) achieved an error rate of 16 percent, whereas the best previous algorithms had managed only about 25 percent.

One of the biggest challenges of deep learning is the need for training data. Large volumes of data are needed to train networks to do even the most rudimentary things. This data must also be relatively clean to create networks that have any meaningful predictive value. For many organizations, this makes machine learning impractical. It's not just the mechanics of creating neural networks that are challenging (although that is itself a hard task), but also organizing and structuring enough data to do something useful with it.

There is an abundance of data available in the world—more than 180 zettabytes (a zettabyte is 10^21 bytes, or a 1 followed by 21 zeros) is predicted to exist by 2025. Ninety-nine percent of the data in the world has not yet been analyzed, and more than 80 percent of it is unstructured, meaning that there is plenty of opportunity and many hidden gems in the data we are collecting. Sadly, however, much of this data is not in any state to be analyzed.

So, what can enterprises do?

You need to think about data differently from how you do today. Data must be treated as a building block for information and analytics, and it must be collected to answer a question or set of questions. This means that it must have the following characteristics (a minimal sketch of automated checks for them follows the list):

  • Accuracy: While obvious, the data must be accurate.
  • Completeness: The data must be relevant, and data that is necessary to answer the question asked must be present. An obvious example of incomplete data would be a classroom where there are 30 students, but the teacher calculates the average for only 15.
  • Consistency: If one database indicates that there are 30 students in a class and a second database shows 31 in the same class, that is an issue.
  • Uniqueness: If a student has different identifiers in two separate databases, this is an issue as it opens the risk that information won’t be complete or consistent.
  • Timeliness: Data can change, and the AI model may need to be updated.
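
As referenced above, here is a minimal sketch of how these characteristics can be checked automatically using pandas. The column names ("student_id", "score", "updated_at") and the 30-day freshness threshold are hypothetical, and accuracy is left to domain-specific validation.

```python
# A minimal sketch of automated checks for the characteristics above,
# using pandas. The column names and the 30-day freshness threshold are
# hypothetical; accuracy usually needs domain-specific validation and is
# omitted here.
import pandas as pd

def check_quality(df: pd.DataFrame, second_source: pd.DataFrame) -> dict:
    return {
        # Completeness: every record needed to answer the question is present.
        "complete": not df["score"].isna().any(),
        # Consistency: the two systems of record agree on the head count.
        "consistent": len(df) == len(second_source),
        # Uniqueness: one record per student identifier.
        "unique": df["student_id"].is_unique,
        # Timeliness: the data has been refreshed recently enough.
        "timely": (pd.Timestamp.now() - df["updated_at"].max()).days <= 30,
    }
```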

Beyond the data itself, there are severe constraints that can impede analytics and deep learning, including security and access, privacy, compliance, IP protection, and physical and virtual barriers. These constraints need to be addressed early: it doesn't help the enterprise to have all the data if the data is inaccessible for various reasons. Often, steps need to be taken such as scrubbing the data so that no private content remains. Sometimes, agreements need to be made between parties that are sharing data, and sometimes technical work needs to happen to move the data to locations where it can be analyzed. Finally, the format and structure of the data need to be considered. Recently, I was looking at currency rates from the Federal Reserve going back 40 years for a personal project and then, in one of those head-slapping moments, I realized that there was a discontinuity from 1999 onwards: the euro had replaced most European currencies. There was a way I could mitigate the problem, but it was deeply unsatisfying. Legacy data might be plentiful, but it may be incompatible with the problem at hand.
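
To make the currency example concrete, the sketch below shows one possible (and, as noted, unsatisfying) mitigation: extending the euro series backwards by converting a legacy Deutsche mark series at its fixed conversion rate. The DataFrame layout and column names are hypothetical; 1.95583 Deutsche marks per euro is the official fixed rate.

```python
# An illustrative (and admittedly unsatisfying) mitigation for the 1999
# discontinuity: extend the euro series backwards by converting a legacy
# Deutsche mark series at its official fixed rate. The DataFrame layout
# and column names are hypothetical.
import pandas as pd

DEM_PER_EUR = 1.95583  # official fixed conversion factor

def stitch_usd_eur(rates: pd.DataFrame) -> pd.Series:
    """`rates` has a DatetimeIndex and columns 'usd_per_dem' and 'usd_per_eur'."""
    # USD/DEM * DEM/EUR = USD/EUR for the pre-euro years.
    pre_1999 = rates.loc[:"1998-12-31", "usd_per_dem"] * DEM_PER_EUR
    post_1999 = rates.loc["1999-01-01":, "usd_per_eur"]
    return pd.concat([pre_1999, post_1999]).rename("usd_per_eur_stitched")
```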

The moral of the story is that we are deluged with data, but often the conditions do not allow the data to be used. Sometimes, enterprises are lucky, and with some effort, they can put the data into good shape. Very often, enterprises will need to rethink how to collect or transform data to a form that is consumable. Agreements can be made to share data or merge data sets, but completeness issues often remain.

As noted earlier, the key to success is to start with a question and then structure the training data, or collect the right data, to answer that question. While immense barriers remain in collecting training data, there is clearly a push by enterprises toward higher-quality data, as evidenced by the growing influence of data scientists. I am very optimistic that the corpus of high-quality training data will improve, thus enabling wider adoption of AI across enterprises of all sizes.

This article is published as part of the IDG Contributor Network.

Source: InfoWorld Big Data

VAZATA Chooses ZERTO As IT Resilience Partner for Disaster Recovery

VAZATA has announced that it has chosen Zerto to build its cloud-based infrastructure-as-a-service (IaaS) disaster recovery offering. The partnership helps IT teams achieve enterprise-class, real-time data replication for disaster recovery within their virtual environments, faster and more affordably.

The combination of VAZATA's cloud IaaS platform with Zerto's IT resilience software platform provides full control over disaster recovery processes in cloud virtual environments. This minimizes hardware costs and disaster recovery complexity, supporting recovery time objectives (RTOs) of minutes and recovery point objectives (RPOs) of seconds.

“Uninterrupted cloud computing has become a non-negotiable best practice for business,” said Wade Thurman, chief operations officer, VAZATA. “Even a few minutes’ unavailability of data and applications puts a business at risk.”

“VAZATA offers an IaaS platform architected on enterprise-class equipment that delivers rapid provisioning, flexible customization, and outstanding compliance and security, including FedRAMP accreditation,” said Don Wales, vice president global cloud sales, Zerto. “VAZATA’s flexible, secure, and scalable cloud solutions help Zerto customers accomplish their disaster recovery mission.”

The Federal Risk and Authorization Management Program (FedRAMP) is a government-wide program that provides a standardized approach to security assessment, authorization, and continuous monitoring for cloud products and services. VAZATA was the first infrastructure-as-a-service (IaaS) organization to earn an Authority to Operate (ATO) under the U.S. Federal Government's “Cloud First” policy.

Source: CloudStrategyMag

2017 Review Shows $180 billion Cloud Market Growing at 24% Annually

New data from Synergy Research Group shows that across six key cloud services and infrastructure market segments, operator and vendor revenues for the four quarters ending September 2017 reached $180 billion, having grown by 24% on an annualized basis. IaaS & PaaS services had the highest growth rate at 47%, followed by enterprise SaaS at 31% and hosted private cloud infrastructure services at 30%. 2016 was notable as the year in which spend on cloud services overtook spend on the hardware and software used to build public and private clouds, and in 2017 the gap widened. In aggregate, cloud service markets are now growing more than three times as quickly as cloud infrastructure hardware and software. Companies that featured most prominently among the 2017 market segment leaders were Amazon/AWS, Microsoft, IBM, Salesforce, Dell EMC, HPE, and Cisco.

Over the period Q4 2016 to Q3 2017, total spend on hardware and software to build cloud infrastructure approached $80 billion, split evenly between public and private clouds, though spend on public cloud is growing more rapidly. Infrastructure investments by cloud service providers helped them generate over $100 billion in revenues from cloud infrastructure services (IaaS, PaaS, hosted private cloud services) and enterprise SaaS; in addition, that cloud provider infrastructure supports internet services such as search, social networking, email, e-commerce, and gaming. Meanwhile, unified communications as a service (UCaaS), while in many ways a different type of market, is also growing strongly and driving some radical changes in business communications.

“We tagged 2015 as the year when cloud became mainstream and 2016 as the year when cloud started to dominate many IT market segments. In 2017 cloud was the new normal,” said John Dinsdale, a Chief Analyst and Research Director at Synergy Research Group. “Major barriers to cloud adoption are now almost a thing of the past, with previously perceived weaknesses such as security now often seen as strengths. Cloud technologies are now generating massive revenues for cloud service providers and technology vendors and we forecast that current market growth rates will decline only slowly over the next five years.”

Source: CloudStrategyMag

Trilio Data Predicts 2018 Will Be The Year Of The Hybrid Cloud

Trilio Data is sharing its 2018 predictions.

2018 Will be the Year of the Hybrid Cloud

Enterprises are increasingly adopting the cloud for data management, and the hybrid cloud (the use of both private and public clouds) is heading the same way. According to a SUSE survey, hybrid cloud strategies are growing faster than private or public cloud, with 66% of respondents expecting hybrid cloud growth to continue, compared to 55% for private cloud and 36% for public cloud. Additionally, MarketsandMarkets estimates that the hybrid cloud market will grow from $33.28 billion in 2016 to $91.74 billion by 2021, a compound annual growth rate of more than 22%.

“Organizations evaluating cloud solutions have traditionally been concerned about data privacy, security, and sovereignty and private clouds were seen as the way to address these issues,” said David Safaii, CEO, Trilio Data. “But if there are data security concerns, enterprises should use the security method that makes the most sense, and will increasingly seek out hybrid or multi-cloud approaches. This approach maps performance, data protection, and cost to each workload.”

GDPR Changes the Face of Data Storage and Protection
The EU's General Data Protection Regulation (GDPR) will serve as a catalyst for enterprises investing in data governance and change the face of data protection as we know it. With the impending May 2018 GDPR deadline, companies are seeking out next-generation data protection solutions to replace existing tools or fill in gaps, regardless of their current solution and use of legacy deployments. As public clouds become riskier options under GDPR, this will further bolster interest in hybrid cloud.

“Gartner estimates that fewer than half of all organizations affected by GDPR will be in full compliance by the end of 2018,” said Murali Balcha, CTO and Founder, Trilio. “While GDPR will call for seismic shifts in how businesses manage their data, it may take seeing a non-compliant organization receive a blockbuster fine for some to really take action.”

U.S. Adoption of OpenStack Accelerates
OpenStack has become a popular and strong alternative to traditional IT service delivery models for large enterprises worldwide. An OpenStack User Survey released earlier this year found that while overall adoption of OpenStack has dramatically increased, 61% of users and 74% of their deployments are physically located outside the United States.

“We will continue to see strong growth of OpenStack in international markets, but 2018 will be the year we see an accelerated uptick in U.S.-based OpenStack deployments,” said Balcha.

Source: CloudStrategyMag

IDG Contributor Network: In 2018, can cloud, big data, and AI stand more turmoil?

The number of new technologies introduced in 2017 has been overwhelming: The cloud was adopted faster than analysts projected and brought several new tools with it; AI was introduced into just about all areas of our lives; IoT and edge computing emerged; and a slew of cloud-native technologies came to fruition, such as Kubernetes, serverless computing, and cloud databases. I covered some of these a year ago in my 2017 predictions, and it's now time to analyze the trends and anticipate what will likely happen in the tech arena next year.

While we love new tech, the average business owner, IT buyer, or software developer glazes over at this massive wave of innovation and doesn't know how to start turning it into business value. We will see several trends emerge in 2018, and their key focus will be on making new technology easy to adopt and consume.

Integrated platforms and everything becomes serverless

Amazon and the other cloud providers are in a race to gain and maintain market share, so they keep raising the level of abstraction and cross-service integration to improve developer productivity and deepen customer lock-in. We saw Amazon introduce new database-as-a-service offerings and fully integrated AI libraries and tools at last month's AWS re:Invent. It also started making a distinction between different forms of serverless: AWS Lambda is now about serverless functions, while AWS Aurora and Athena are about “serverless databases,” broadening the definition of serverless to any service that hides the underlying servers. Presumably, many more cloud services will now be able to call themselves “serverless” by this wider definition.
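
For readers who have not written a serverless function, here is a minimal sketch of an AWS Lambda handler in Python. The (event, context) signature is Lambda's standard convention; the "name" payload field and the response shape are illustrative assumptions rather than any particular application's API.

```python
# A minimal sketch of an AWS Lambda function handler in Python. The
# (event, context) signature is Lambda's standard convention; the
# "name" payload field and the response shape are illustrative.
import json

def lambda_handler(event, context):
    # `event` carries the invocation payload; `context` carries runtime
    # metadata such as the request ID and remaining execution time.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```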

In 2018, we will see the cloud providers place greater emphasis on further integrating individual services and offering higher-level abstractions, with a focus on services related to AI, data management, and serverless. These solutions will make the jobs of developers and operations professionals simpler and hide inherent complexities, but they do carry a risk of greater lock-in.

In 2017, we saw all cloud providers align with Kubernetes as the microservices orchestration layer, which relieved some of the lock-in. In 2018, we will see a growing set of open and commercial services built on top of Kubernetes that can deliver a multicloud alternative to proprietary cloud offerings. Iguazio's Nuclio is of course a great example of such an open, multicloud serverless platform, as is Red Hat's OpenShift multicloud PaaS.

The intelligent edge vs. the private cloud

The cloud enables the business agility necessary to develop modern, data-driven applications, whether at startups or at large enterprises operating like startups. The challenge is that you can't ignore data gravity, as many data sources still live at the edge or in the enterprise. This, together with 5G bandwidth, latency, new regulations like GDPR, and more, forces you to place computation and storage closer to the data sources.

Today's public cloud model is one of service consumption: developers and users can bypass IT, spin up serverless functions, use self-service databases, or even upload a video to a cloud service that returns it translated into the desired language. With the on-prem alternatives, however, you must build the services yourself, and the technology stack is evolving so rapidly that it is virtually impossible for IT teams to build modern services that can compare with cloud alternatives, forcing organizations out to the cloud.

IT vendor solutions labeled “private cloud” are nothing like the real cloud, because they focus on automating IT ops. They don't provide higher-level user- and developer-facing services—IT ends up assembling those out of dozens of individual open source or commercial packages, adding common security layers, logging, configuration management, and so on. This has opened an opportunity for cloud providers and new companies to enter the edge and on-prem space.

In 2017, we saw Microsoft CEO Satya Nadella increasing focus on what he calls “the intelligent edge.” Microsoft introduced the Azure Stack, which is a mini version of Azure’s cloud and unfortunately contains only a small portion of the services Microsoft offers in the cloud. Amazon started delivering edge appliances called Snowball Edge, and I expect it to double down on those efforts.

The intelligent edge is not a private cloud. It provides the same set of services and operations models as the public cloud, but it is accessed locally and is in many cases operated and maintained from a central cloud, just as operators manage our cable set-top boxes.

In 2018, we will see the traditional private cloud market shrink while momentum around the intelligent edge grows. Cloud providers will add or enhance edge offerings, and new companies will enter that space, in some cases with integrated offerings targeted at specific vertical applications or use cases.

AI from raw technology to embedded feature and vertical stacks

We saw the fast rise of AI and machine learning technologies in 2017, but despite the hype, they are in reality mainly being used by market-leading web companies like Amazon, Google, and Facebook. AI is far from trivial for the average enterprise, and there is really no reason for most organizations to hire scarce data scientists or build and train AI models from scratch.

We can see how companies like Salesforce have built AI into their platforms, leveraging the large amounts of customer data they host. Others are following that path to embed AI into their offerings as a feature. At the same time, AI is getting a vertical focus, and we'll start seeing AI software solutions for specific industries and verticals such as marketing, retail, health care, finance, and security. Users won't need to know the internals of neural networks or regression algorithms in these solutions. Instead, they will provide data and a set of parameters and get back an AI model that can be used in their application.
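
As a rough sketch of that "provide data and parameters, get a model back" experience, the snippet below uses scikit-learn as a stand-in for a packaged vertical AI offering; the CSV file, column names, and parameters are illustrative assumptions.

```python
# A rough sketch of the "provide data and parameters, get a model back"
# pattern, with scikit-learn standing in for a packaged vertical AI
# offering. The CSV file, columns, and parameters are illustrative.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

data = pd.read_csv("customer_interactions.csv")        # hypothetical data set
X, y = data.drop(columns=["churned"]), data["churned"]

# The user supplies data plus a handful of parameters; the model's
# internals stay hidden behind a uniform fit/predict interface.
model = RandomForestClassifier(n_estimators=200, max_depth=8)
model.fit(X, y)
print(model.predict(X.head()))
```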

AI is still a very new field with many overlapping offerings and no standardization. If you used a framework like TensorFlow, Spark, H2O, or Python for the learning phase, you'll need to use the same one for the inferencing (scoring) part. In 2018, we will see efforts to define AI models that are open and cross-platform. In addition, we will see more solutions that automate the process of building, training, and deploying AI, like the newly introduced AWS SageMaker.

From big data to continuous data

In the past few years, organizations have started developing a big data practice driven by central IT. Its goal has been to collect, curate, and centrally analyze business data and logs for future applications. Data has been collected in Hadoop clusters and data warehouse solutions and then used by a set of data scientists who run batch jobs and generate reports or dashboards. According to leading analysts, this approach has largely failed; Gartner reports that 70 percent of companies see no ROI from it. To deliver ROI, data must be actionable: integrated into business processes and derived from fresh data, just as we see in targeted ads and in Google and Facebook recommendations.

Data insights must be embedded into modern business applications. For example, a customer accessing a website or using a chatbot needs an immediate response with targeted content based on his or her recent activities or profile. Sensor data collected from IoT or mobile devices flows in continuously and requires immediate action to drive alerts, detect security violations, provide predictive maintenance, or enable corrective actions. Visual data is inspected in real time for surveillance and national security; it is also used by retailers to analyze point-of-sale data such as inventory status, customer preferences, and real-time recommendations based on observed customer activities. Data and real-time analytics reduce business costs by automating processes that were once manual. Cars are becoming connected and autonomous. Telemarketers and assistants are being replaced with bots. Fleets of trucks, cab drivers, and technicians are orchestrated by AI and event-driven logic to maximize resource utilization.

All this has already started happening in 2017.

Hadoop was invented roughly a decade ago, and data warehousing well before that; both predate the age of AI, stream processing, and in-memory and flash technologies. Enterprises are now seeing that there is limited value in building data lakes, as they can perform data mining by using simpler cloud technologies. The focus is shifting from merely collecting data to using data continuously, an area in which technologies focused on data at rest and central IT-driven processes just won't fly.

In 2018, we will see an ongoing shift from big data to fast data and continuous data-driven applications. Data will be continuously ingested from a wide variety of sources. It will be contextualized, enriched, and aggregated in real time, and compared against pre-learned or continuously learned AI models, so that it can generate immediate responses to users, drive actions, and be presented in real-time, interactive dashboards.
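
A minimal sketch of such a continuous pipeline appears below: events are ingested as they arrive, enriched, scored against a pre-learned model, and acted on immediately. The event source, field names, and model stub are hypothetical placeholders rather than any specific product's API.

```python
# A minimal sketch of a continuous, data-driven pipeline: ingest events
# as they arrive, enrich them, score them against a pre-learned model,
# and act immediately. The event source, field names, and model stub
# are hypothetical placeholders.
import time
from typing import Dict, Iterator

def event_stream() -> Iterator[Dict]:
    # Stand-in for a message bus, IoT gateway, or clickstream feed.
    while True:
        yield {"device_id": "sensor-42", "temperature": 87.5, "ts": time.time()}

def enrich(event: Dict) -> Dict:
    event["site"] = "plant-3"      # join with reference/context data
    return event

def score(event: Dict) -> float:
    # Stand-in for a pre-learned or continuously learned anomaly model.
    return 1.0 if event["temperature"] > 85.0 else 0.0

for event in event_stream():
    if score(enrich(event)) > 0.5:
        print(f"alert: {event['device_id']} anomalous at {event['ts']}")
    break  # a real pipeline would run continuously; one iteration shown here
```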

Developers will use prepackaged cloud offerings or integrate their solutions by using relevant cloud-native services. In the enterprise, the spotlight will move from IT to the business units and application developers who will be embedding data-driven decisions in existing business logic, web portals, and day-to-day customer interactions.       

The bottom line for 2018 is:

  1. The intelligent edge will grow, and the traditional private cloud market will shrink.
  2. We’ll start seeing AI software solutions for specific industries and verticals. Also, AI models will start being open and cross-platform.
  3. Fast data, continuous applications and cloud services will replace big data and Hadoop.
  4. One way or another, cloud services will be easier to consume, thereby increasing the gap between them and traditional and private cloud solutions. So bring on the shackles and get ready to be even more locked in!

Happy New Year!

This article is published as part of the IDG Contributor Network.

Source: InfoWorld Big Data