We used these same steps and strategy to help several major government organizations at various stages of data maturity. Read on to see how.
The client: United States Government
The challenge: This client had no scalable mechanism for storing massive amounts of spatial data and lacked the ability to index, query, search, and analyze it quickly and efficiently.
The solution: We developed GeoWave, a library for storing, indexing, and searching multi-dimensional data on top of a sorted key-value data store. GeoWave adds multi-dimensional indexing capabilities to Apache Accumulo and stores data in an open source format.
The outcome: This spatial index for Accumulo and Apache HBase can store massive amounts of information, and has been deployed across enterprise environments. GeoWave was released as an open source project in 2014 and is publicly accessible. Find it here: https://github.com/locationtech/geowave
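To make the idea of multi-dimensional indexing on a sorted key-value store more concrete, here is a minimal sketch in Python. It is not GeoWave's code or API. It assumes a Z-order (Morton) space-filling curve, chosen for brevity, and uses a toy in-memory `SortedStore` in place of Accumulo or HBase. The names `quantize`, `interleave`, `z_key`, `SortedStore`, and `bbox_query`, along with the 16-bit resolution, are hypothetical and exist only for illustration.

```python
import bisect

BITS = 16  # assumed resolution: bits per dimension


def quantize(value, lo, hi, bits=BITS):
    """Map a value in [lo, hi] to an integer cell in [0, 2**bits - 1]."""
    cells = (1 << bits) - 1
    return min(cells, max(0, int((value - lo) / (hi - lo) * cells)))


def interleave(x, y, bits=BITS):
    """Interleave the bits of x and y into one Z-order (Morton) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def z_key(lon, lat):
    """Turn a longitude/latitude pair into a single sortable integer key."""
    return interleave(quantize(lon, -180.0, 180.0), quantize(lat, -90.0, 90.0))


class SortedStore:
    """Toy stand-in for a sorted key-value store (keys kept in order)."""

    def __init__(self):
        self._keys = []
        self._values = []

    def put(self, key, value):
        i = bisect.bisect_right(self._keys, key)
        self._keys.insert(i, key)
        self._values.insert(i, value)

    def scan(self, start, end):
        """Return all values whose key lies in [start, end]."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, end)
        return self._values[lo:hi]


def bbox_query(store, min_lon, min_lat, max_lon, max_lat):
    """Bounding-box search: one key-range scan, then an exact filter.

    A Z-order code never decreases when either coordinate increases, so
    every point inside the box has a key between the keys of the box's
    min and max corners. The scan can also return points outside the
    box, which the filter removes.
    """
    candidates = store.scan(z_key(min_lon, min_lat), z_key(max_lon, max_lat))
    return [
        (name, lon, lat)
        for (name, lon, lat) in candidates
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    ]


store = SortedStore()
for name, lon, lat in [
    ("dc", -77.04, 38.91),
    ("nyc", -74.01, 40.71),
    ("la", -118.24, 34.05),
    ("london", -0.13, 51.51),
]:
    store.put(z_key(lon, lat), (name, lon, lat))

hits = bbox_query(store, -80.0, 35.0, -70.0, 45.0)
print(sorted(name for name, _, _ in hits))
```

The single range scan keeps the sketch short but can return many false positives for large boxes. A production index would typically split the query into several tighter key ranges.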
The client: Defense Logistics Agency (DLA)
The challenge: The DLA had foundational data management issues, with disparate systems holding procurement data and historical data in multiple formats. They needed a way to bring massive amounts of data into one place to manage and analyze for better decision making.
The solution: We used our Open Data Platform capability to establish an integrated data environment. It lets the DLA input, store, catalog, index, and retrieve data, and build reports with advanced analytics across a wide variety of inputs.
The outcome: The DLA can now ingest and analyze vast quantities of data with ease in their centralized data lake. Moreover, the DLA can grant access to key stakeholders and team members to improve knowledge-sharing, reduce silos, and support data-driven decision making across the organization.
The client: Joint Improvised-Threat Defeat Organization (JIDO)
The challenge: JIDO needed to ingest and organize large volumes of data from hundreds of sources, comprising millions of intelligence and operations summaries. The goal was to interrogate that data quickly and plot it geographically in order to detect trends and illuminate threat networks for action.
The solution: We worked with JIDO to develop an open-source-based cloud architecture called CATAPULT that uses machine learning to process and index every ingested word, location, and specific threat-related entity. Coping with the vast amounts of ever-changing data required a symbiotic team-based approach. An enterprise data team focused on brokering data agreements and data ingest options ranging from RSS feeds to email feeds to direct database connections, depending on the source. A software development team developed “attack the network” tools such as HORIZON and SEARCH in an agile fashion to query and dashboard data for insights in support of decision making. Data integrators deployed globally across military commands and operational activities serve as data scouts, discovering new feeds while working with analysts to provide data insights. As a byproduct of constant data interrogation, the data integrators frequently create new ways to apply multiple data feeds against dynamic problem sets and share data value via vignettes and storyboards.
The outcome: JIDO has an operationalized, enduring cloud solution. Deployed on several DOD network domains, the solution provides an ever-expanding corpus of data that can be dynamically sliced and diced using JIDO-developed tools to gain insight into emerging threats. The modular architecture and standards-based approach allow for continuous adaptation and innovation as new machine learning and data science technologies emerge.
The client: Chief Information Officer/Comptroller
The challenge: In this senior leadership role, our client is responsible for financial audit readiness and for communicating and driving decisions for the organization based on financial data. To that end, our client needed to design and build a robust data platform that could ingest, store, index, and securely expose financial datasets to consumers. The organization also needed to build a dashboarding and reporting capability that could adapt to an ever-changing set of financial queries.
The solution: We worked with our client to identify the right tools and technology to build their data infrastructure. This included best-of-breed technologies like Apache Hadoop, Spark, Trifacta, and others.
The outcome: Our client can capture far more data than ever before. They ingest terabytes of financial data in a consistent format that supports the advanced analytics needed to meet organizational goals.