Security

RocketML Overview

RocketML’s mission is to make data scientists and researchers more productive by making big data problems computationally tractable through scientific computing and HPC techniques for machine learning. Big data problems arise in life sciences, energy, automotive, transportation, cyber security and many autonomous system applications. We believe your data is your asset, and we make every effort to secure and protect it. It is one of our most important responsibilities. We’re committed to being transparent about our security practices and helping you understand our approach.

RocketML primarily offers its software as a SaaS product deployed into the client’s private IT infrastructure, which may be on AWS, Azure, GCP or any other cloud service provider. As such, customer data never leaves the client’s private firewalls. This design offers our customers maximum data protection.

Secondarily, for short-term proof-of-concept projects, RocketML offers a “single tenant hosted POC” solution for the specific deliverables identified in the Statement of Work contract documents.

RocketML Enterprise is a foundation layer of software for building data driven intelligent software, aka Data Products. It makes getting value from raw data (be it tabular or text or image or audio) easy. As data moves from application source to data warehouse and through the various steps of machine learning model building process, the increasing surface area of information security becomes a primary concern for the IT departments who are stewards of data.

RocketML Enterprise is designed from the ground up to make the data science and machine learning process highly secure and to meet the most stringent regulatory and corporate compliance requirements. Our customers’ needs span a wide range of complexity, from healthcare, insurance, financial services and online retail to government with locked-down classified environments.

Many use cases require code to run inside a private network, behind complex firewall rules, and for data to never leave specific boundaries. RocketML Enterprise is designed with a flexible infrastructure that is compatible with government-grade security requirements. This document covers controls we have in place for our primary offering – deployed SaaS. We also touch upon “single tenant hosted POC” practices where applicable.

Organizational Security

RocketML’s security program is based on the concept of defense in depth: securing our organization, and your data, at every layer. Our security program is aligned with ISO 27000, AICPA Trust Service Principles, and NIST standards, and it evolves continually with updated guidance and new industry best practices. As a young startup, we are working towards external certification. We are highly security minded and have the necessary controls in place for Security Architecture, Product Security, Security Engineering and Operations, Detection and Response, and Risk and Compliance.

Infrastructure

Physical Infrastructure (Data Centers)

RocketML’s physical infrastructure is hosted and managed within the secure data centers of cloud service providers (AWS, Azure, etc.) and utilizes their services. These cloud service providers (CSPs) continually manage risk and undergo recurring assessments to ensure compliance with industry standards. CSPs have many years of experience in designing, constructing, and operating large-scale data centers. CSP data centers are housed in nondescript facilities, and critical facilities have extensive setback and military grade perimeter control berms as well as other natural boundary protection. Physical access is strictly controlled both at the perimeter and at building ingress points by professional security staff utilizing video surveillance, state of the art intrusion detection systems, and other electronic means. Authorized staff must pass two-factor authentication no fewer than three times to access data center floors. All visitors and contractors are required to present identification and are signed in and continually escorted by authorized staff.

CSPs provide data center access and information only to employees who have a legitimate business need for such privileges. When an employee no longer has a business need for these privileges, that access is immediately revoked, even if the person remains an employee of the CSP (for example, Amazon or Amazon Web Services). All physical and electronic access to data centers by CSP employees is logged and audited routinely.

CSP’s data center operations have been accredited under:

ISO 27001
SOC 1 and SOC 2/SSAE 16/ISAE 3402 (Previously SAS 70 Type II)
PCI Level 1
FISMA Moderate
Sarbanes-Oxley (SOX)

A detailed SOC 2 audit report related to all the services RocketML uses from CSPs is available on request. RocketML uses approximately 30-35 CSP services to deliver full functionality for both Scientific Computing as well as Big Data AI solutions. All these CSP services are SOC 2 audited and certified.

For additional information on AWS see: https://aws.amazon.com/security.

Network Security & Authentication

Firewalls

Firewalls are utilized to restrict access to systems from external networks and between systems internally. By default, all access is denied; only explicitly permitted ports and protocols are allowed, based on business need. Each system is assigned to a firewall security group based on the system’s function. Security groups restrict access to only the ports and protocols required for a system’s specific function to mitigate risk.
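The default-deny behaviour described above can be illustrated with a minimal sketch. The Rule structure, the group name and the is_allowed helper are hypothetical and exist only for illustration; they are not RocketML’s or any CSP’s actual firewall API.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Rule:
    """One explicit allow rule (hypothetical structure, for illustration only)."""
    protocol: str  # e.g. "tcp"
    port: int      # destination port
    source: str    # name of the peer group allowed to connect


def is_allowed(rules: Iterable[Rule], protocol: str, port: int, source: str) -> bool:
    """Default deny: traffic passes only if some rule explicitly allows it."""
    return any(
        r.protocol == protocol and r.port == port and r.source == source
        for r in rules
    )


# Hypothetical security group for a web-facing system:
# HTTPS from the load balancer is the only permitted traffic.
WEB_GROUP = [Rule("tcp", 443, "load-balancer")]

print(is_allowed(WEB_GROUP, "tcp", 443, "load-balancer"))  # True
print(is_allowed(WEB_GROUP, "tcp", 22, "load-balancer"))   # False
```

Because the function returns False whenever no rule matches, an empty rule list denies everything, which is the point of assigning each system a group containing only the ports and protocols its function needs.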

Authentication and access

The RocketML deployed SaaS is designed to work with the client’s preferred strong authentication and access control systems. This helps clients meet their compliance goals. No RocketML staff have access to this system.

On the “single tenant hosted POC”, authentication and access controls are implemented to restrict access to administrative functions, internal support tools, and customer data. User and administration access is based on Transport Layer Security (“TLS”) certificates, which help to positively identify the resource access requester. This service also offers transport encryption to enhance data confidentiality in transit. Operating system access is limited to RocketML staff and requires multi-factor authentication. Operating systems do not allow password authentication, which prevents password brute force attacks, theft, and sharing. We do not store users’ passwords anywhere in our system.

RocketML follows a formal process to grant or revoke employee access to RocketML resources. On AWS, we use AWS Cognito with SAML authentication for invited users only. The system uses Secure Shell (“SSH”) and TLS certificates to provide secure and flexible access mechanisms. These mechanisms are designed to grant access rights to systems and data only to authorized users.

Both user and internal access to customer data is restricted through the use of unique user IDs. Access to sensitive systems and applications requires two-factor authentication, combining a unique user ID and strong password with One-Time-Passwords (“OTP”), Security Keys and/or certificates. Periodic reviews of access lists are implemented to help ensure access to customer data is appropriate and authorized. Access to production machines, network devices and support tools is managed via an access group management system. Membership in these groups must be approved by the respective group administrators. User groups are reviewed annually.
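For readers unfamiliar with how a one-time password acts as a second factor, here is a minimal sketch of the widely used time-based scheme (TOTP, RFC 6238, built on HOTP, RFC 4226). It is an assumption that the OTP mentioned above follows this scheme; the function names are illustrative and this is not RocketML’s implementation.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA-1."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the current time step."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


def verify(secret: bytes, submitted: str, at: Optional[float] = None) -> bool:
    """Compare the submitted code with the expected one in constant time."""
    return hmac.compare_digest(totp(secret, at), submitted)


# Shared secret and timestamp taken from the RFC 6238 test vectors.
print(totp(b"12345678901234567890", at=59))  # 287082
```

The server and the user’s device share the secret once at enrolment; afterwards each side derives the same short-lived code independently, so the code proves possession of the device without the secret ever being transmitted at login.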

RocketML IT staff use SAML-based authentication to access AWS IT infrastructure.

The RocketML application comes enabled with MFA and SAML-based authentication.

Change Management

Change Management policies, including security code reviews and emergency fixes, are in place, and procedures for tracking, testing, approving, and validating changes are documented. Changes are developed using a code versioning tool to manage source code, documentation, release labeling and other functions. RocketML requires all code changes to be reviewed and approved by a separate technical resource, other than the developer, to evaluate the quality and accuracy of changes. Further, all application and configuration changes are tested prior to migration to the production environment.

RocketML uses detection and monitoring procedures to identify (1) changes to configurations (software) that result in the introduction of new vulnerabilities, and (2) susceptibilities to newly discovered vulnerabilities.

Data Security

User files

On deployed SaaS, our clients decide where to store their proprietary data, including log files and project files. Most clients prefer to store them on Amazon’s S3 data storage service. Additionally, for scripts and code that data scientists create, RocketML supports storage in the client’s preferred Git repository. The RocketML platform verifies that any user asking for a file is authorized to access the requested file.
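The authorization check described above might look like the following sketch. The "<project>/<path>" key layout, the project_members mapping and the AccessDenied exception are all hypothetical, introduced purely to illustrate a check that denies by default; the source does not describe the platform’s actual access model.

```python
from typing import Dict, Set


class AccessDenied(Exception):
    """Raised when a user requests a file they are not authorized to read."""


def authorize_file_request(user: str, file_key: str,
                           project_members: Dict[str, Set[str]]) -> str:
    """Return the file key only if the user belongs to the owning project.

    Assumes (hypothetically) that file keys look like "<project>/<path>".
    Unknown projects have no members, so requests for them are denied.
    """
    project, _, _ = file_key.partition("/")
    if user not in project_members.get(project, set()):
        raise AccessDenied(f"{user} may not access {file_key}")
    return file_key


# Hypothetical membership data for illustration.
members = {"churn-model": {"alice", "bob"}}
print(authorize_file_request("alice", "churn-model/run-1/log.txt", members))
```

The check runs before any storage call is made, so an unauthorized request never reaches the underlying bucket or repository.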

System data

We use a database to store information about users (such as username and email address) as well as metadata about Projects and Runs (e.g., timestamps and status messages). This data is stored in a database hosted within AWS as preferred by our customers.

Data at rest

Data is encrypted anytime it is “at rest” in the RocketML platform, including in our “blob store”, and on disk on the “executor” machines that run user scripts.

RocketML uses the AES-256 encryption algorithm to encrypt your data on the server.

Data in transit

Data is encrypted in transit between users’ machines and the different parts of the RocketML platform. We use industry-standard TLS (the successor to SSL) for encrypted communication.

By default, we enforce encryption in transit for traffic to AWS S3 buckets.
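One common way to enforce encryption in transit on an S3 bucket is a bucket policy that denies any request not made over TLS, using the aws:SecureTransport condition key. The sketch below builds such a policy document; the bucket name is a placeholder, and the source does not state that this is the exact mechanism RocketML uses.

```python
import json


def tls_only_bucket_policy(bucket_name: str) -> str:
    """Build an S3 bucket policy that denies any request not sent over TLS."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
                # aws:SecureTransport is "false" for plain-HTTP requests.
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


# "example-client-bucket" is a placeholder name.
print(tls_only_bucket_policy("example-client-bucket"))
```

An explicit Deny takes precedence over any Allow in AWS policy evaluation, so once this policy is attached, plain-HTTP requests are rejected regardless of the caller’s other permissions.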

Access to Customer Data

RocketML staff do not access or interact with customer data or applications as part of normal operations. There may be cases where RocketML interacts with customer data or applications at the customer’s request for support purposes, or where required by law.

System Configuration

System configuration and consistency are maintained through standard, up-to-date images, configuration management software, and by replacing systems with updated deployments. Systems are deployed using images that are updated with configuration changes and security updates before deployment. Once the new systems are deployed, the existing systems are decommissioned.

Software Engineering Methodology

Every line of code is reviewed before being deployed to our production environment. Among other things, code reviewers are trained to look for security vulnerabilities. RocketML follows an agile methodology and OWASP secure coding practices.

People

RocketML has implemented a process-based service quality environment designed to deliver the latest machine learning technology to customers. The fundamentals underlying the services provided are the adoption of standardized, repeatable processes; the hiring and development of highly skilled resources; and leading industry practices. RocketML’s repeatable process model includes key infrastructure and product related processes and controls over security, availability, process integrity, and confidentiality. Formal organizational structures exist, although flat in nature, and are available to employees. RocketML has developed formal policies, procedures, and job descriptions for operational areas including security administration, change management, hiring, terminations, and incident escalation. These policies and procedures have been designed to segregate duties and enforce responsibilities based on job functionality. Policies and procedures are reviewed and updated as necessary.