FAQs: Data Stream

What is Data Stream?

Data Stream is Demandbase’s customer-data delivery service. With this service you can have your Demandbase customer data streamed daily to your data lake, warehouse, cloud service provider, or directly to your BI tools, such as Domo, Tableau, and PowerBI. Having access to your Demandbase customer data within your own environments gives you the flexibility to enable:

  • Dashboard development
  • Integrated data science research
  • Machine Learning model development
  • Business Intelligence integration
  • Aggregated analysis with other information

With Data Stream you no longer need to complete the daily export and import tasks. All of the data is automatically delivered to you. You can store all the Demandbase data in the same place as all other data in your company, giving you a holistic view that allows you to gain insights from the combined dataset.

What data is included in Data Stream?

The dataset delivered to you is outlined in the Data Stream Specification document.

What are the delivery options?

  1. Data Warehouse Destinations: Google Cloud Platform (GCP) BigQuery, Amazon Redshift
  2. Data Storage: GCP Google Cloud Storage (GCS), Amazon S3, Microsoft Azure Blob, Demandbase Hosted SFTP
  3. BI Solutions: Tableau, Domo, Power BI, Google Data Studio, and many more!

If most of the data is in Demandbase One, how can Data Stream be useful to me?

There are two key reasons why Data Stream is useful to your company:

  • More data: Data Stream contains all the data on the Demandbase One Platform plus additional data, such as account scores.
  • No more manual process: With Data Stream, you don’t need to complete the daily export and import tasks, which can save you a lot of time. All of the data is automatically delivered to you. You can store all the Demandbase data in the same place as all other data in your company, which gives you a holistic view and allows you to gain insights from the combined dataset.

What are the advantages of Data Stream delivery over a public API for data acquisition?

Demandbase is offering Data Stream delivery before a public API because delivery is the more convenient way to receive data at this volume. Data Stream can contain tens of millions of rows for each customer. Even with batch API calls, that could translate to hundreds of thousands of calls per day, so API calls are not a scalable way for you to access the data. Delivery has a few advantages over pulling the data yourself:

  • Tens of millions of rows of data are delivered daily, all at once. This saves you time because all the data is immediately accessible.
  • Avoids “missing” data due to operational issues.
  • Supports integrated delivery to cloud data warehouses such as BigQuery and Redshift, which also provide programmatic SQL APIs (REST, JDBC, ODBC) and SQL UIs for manual queries.
  • Provides the option to deliver a large raw-format dataset to cloud storage for on-demand processing. This is ideal for engineers and scientists who do further ETL or analysis with technologies like Spark and Hadoop.
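
As a rough illustration of the call-volume point above, the sketch below estimates how many batch API calls a daily pull would take. The row count and batch size are hypothetical figures chosen for the example; Demandbase does not publish a batch size here.

```python
def calls_needed(total_rows: int, rows_per_call: int) -> int:
    """Return the number of batch calls needed to fetch total_rows."""
    if rows_per_call <= 0:
        raise ValueError("rows_per_call must be positive")
    # Integer ceiling division: a partial final batch still costs one call.
    return -(-total_rows // rows_per_call)


# Assumed figures: 30 million rows per day, 100 rows per batch call.
daily_calls = calls_needed(30_000_000, 100)
print(daily_calls)  # 300000
```

Under these assumptions a single day of data would take 300,000 calls, which is the scale the answer above describes.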

Who are the ideal consumers in my company for Data Stream?

The key purpose of Data Stream is to enable customers to build their own dashboards and run ML models; therefore, the main users are data scientists, analysts, and technical marketing operations teams. A good example of the persona is a data analyst who has a solid understanding of the company’s data structure, is proficient in SQL, and is an expert in Excel and Tableau. One of this analyst’s key job functions is to work with multiple stakeholders in the marketing and product teams and to collect business requirements documents (BRDs) for dashboards that surface the insights those stakeholders want.

The Analytics and BI team will be the direct users of Data Stream data, building dashboards for internal teams such as marketing and sales. The Demandbase product team will look to collect two sets of feedback: data-related feedback from the analytics team and business-related feedback from the marketing team.

How often will Demandbase deliver data to me?

Demandbase delivers the data package to your designated data warehouse, cloud storage, or BI tool every day, including weekends, at 7 am EST. Each package includes data from the previous day. In the initial release, there is no option to change the delivery time, because it takes time to collect all the data from different parts of the Demandbase One Platform and to ensure the accuracy of the rollup.

How will the data be structured in the daily delivery package?

If the data is delivered to a data warehouse such as BigQuery, you will receive an ongoing partition table. However, if you have requested delivery in Apache Parquet or CSV format, all the data will be in raw format, which gives you complete freedom to handle the data your way.

Ongoing Partition Table: You will receive the same table every day, with additional rows containing the new daily data. An extra date stamp column is added to the table so you know when each row was entered. The benefit is that you can always query the latest table instead of adding a new table to your data warehouse every day.
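
To make the date stamp column concrete, here is a minimal sketch of selecting only the most recent day's rows from an ongoing table. It uses an in-memory SQLite database as a stand-in for the warehouse, and the table name `account_activity` and the columns `account_id`, `page_views`, and `delivery_date` are hypothetical; the actual schema is defined in the Data Stream Specification document.

```python
import sqlite3

# In-memory SQLite stands in for the data warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account_activity ("
    "account_id TEXT, page_views INTEGER, delivery_date TEXT)"
)

# Two daily deliveries appended to the same table (made-up sample rows).
conn.executemany(
    "INSERT INTO account_activity VALUES (?, ?, ?)",
    [
        ("A-1", 10, "2024-01-01"),
        ("A-2", 4, "2024-01-01"),
        ("A-1", 12, "2024-01-02"),
        ("A-2", 7, "2024-01-02"),
    ],
)


def latest_rows(connection: sqlite3.Connection) -> list:
    """Return only the rows stamped with the most recent delivery date."""
    query = """
        SELECT account_id, page_views, delivery_date
        FROM account_activity
        WHERE delivery_date = (SELECT MAX(delivery_date) FROM account_activity)
        ORDER BY account_id
    """
    return connection.execute(query).fetchall()


latest = latest_rows(conn)
print(latest)  # [('A-1', 12, '2024-01-02'), ('A-2', 7, '2024-01-02')]
```

The same filter-on-latest-date pattern should carry over to a warehouse such as BigQuery or Redshift, although each has its own syntax and partition-pruning behavior.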

What is the format of the data?

A data package can be delivered in two forms: 

  • Into your data warehouse such as BigQuery and Redshift.
  • Into your data storage service such as GCS, S3, Azure Blob and your own data storage. When delivering into the data storage service, we deliver the package in Apache Parquet format by default, with CSV format available upon request.

Demandbase can also host the data for you to access in the Google Cloud Platform or a Hosted SFTP server.

How much storage do I need to use Data Stream?

The daily data package from Demandbase is between 5GB and 15GB in size (uncompressed). If the data is being delivered into the data warehouse, you can specify the length of the historical data you want to keep in the system. If you have a tight quota on storage utilization, you can have your system purge the data that’s older than 6 months.
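
For a back-of-the-envelope storage estimate, the sketch below multiplies the stated daily package size by a retention window and computes the matching purge cutoff date. It assumes 6 months is about 183 days and that sizes are uncompressed, so actual warehouse usage may differ.

```python
from datetime import date, timedelta

RETENTION_DAYS = 183  # assumption: roughly 6 months


def retained_storage_gb(daily_gb: float, retention_days: int) -> float:
    """Uncompressed storage needed to keep retention_days of daily packages."""
    return daily_gb * retention_days


def purge_cutoff(today: date, retention_days: int) -> date:
    """Deliveries stamped before this date fall outside the retention window."""
    return today - timedelta(days=retention_days)


low = retained_storage_gb(5, RETENTION_DAYS)
high = retained_storage_gb(15, RETENTION_DAYS)
print(f"{low:.0f} GB to {high:.0f} GB")  # 915 GB to 2745 GB
```

Under these assumptions, six months of history needs roughly 0.9 TB to 2.7 TB before any compression.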

Is Data Stream compatible with the data warehouse I'm using?

Data Stream supports direct delivery into GCP BigQuery, Amazon Redshift, and Microsoft Azure Cosmos DB. If you use a cloud data storage service or do not want the data delivered into your data warehouse, Data Stream supports delivery into folders in GCP GCS, Amazon S3, and Microsoft Azure Blob.

Are there any limitations or restrictions on the data delivery?

You can choose only one delivery mechanism. For example, Demandbase will not deliver the data into both S3 and BigQuery.

How does Demandbase deliver data if I don't use any of the data warehouses?

If you do not use any of the supported data warehouses, the Data Stream data package can be delivered to any of the supported cloud storage services instead.

Can I use Data Stream if I host my own data center?

Unfortunately, Demandbase will only support data delivery to either cloud data warehouses or cloud data storage services at this point. If you do not use any of these and are interested in Data Stream, share your use case with your account manager or CSM.

Can I request additional data that is not part of Data Stream?

Yes. Since this is the first time Demandbase is giving customers direct access to the data, we are happy to work with any interested customers to make sure they get the most out of Data Stream. If you feel the data you get from Data Stream is inadequate, reach out to your CSM or account manager, who will work directly with the Demandbase product team to find a solution for you.

How can customers validate or preview data before making a purchase?

You can request a trial data package to understand what is included in Data Stream and how you can use the data on a daily basis. For a trial request, we can provide a 1-day data delivery. To set up the trial, Demandbase needs to know which data warehouse or storage service to deliver the data to and the date to start the delivery.

How do I get started with Data Stream?

If you are interested in Data Stream, reach out to your account manager or CSM, who will put you in touch with the product team to make sure you have the right setup to get the most out of Data Stream. The product team at Demandbase will work directly with you to answer any questions you may have. There are some configuration options you will need to select before we can set up the data delivery for you. Once you confirm that you want to move forward with Data Stream, you will work with your Demandbase account manager to have the order form processed and signed. You will continue to receive support from the product team if you need additional help.

How often is the data updated?

The data you receive every day covers activity up to 12 am EST (5 am UTC) of the same day. Because the data is generated daily, you might see discrepancies between the data you just received and the data you see in the Demandbase One Platform at, for example, 12 pm EST, because Data Stream does not include the data from that 12-hour gap.
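
The cutoff arithmetic can be checked with a short sketch. It treats EST as a fixed UTC-5 offset, as stated in the answer above; whether the schedule shifts with daylight saving time is not specified, so that is an assumption, and the example date is arbitrary.

```python
from datetime import date, datetime, time, timedelta, timezone

# EST as a fixed UTC-5 offset (assumption: no daylight saving adjustment).
EST = timezone(timedelta(hours=-5), "EST")


def data_cutoff(delivery_date: date) -> datetime:
    """Return midnight EST at the start of the delivery date."""
    return datetime.combine(delivery_date, time(0, 0), tzinfo=EST)


cutoff = data_cutoff(date(2024, 1, 15))
print(cutoff.astimezone(timezone.utc).isoformat())  # 2024-01-15T05:00:00+00:00

# Data seen in the platform at 12 pm EST is 12 hours past the cutoff.
noon = datetime(2024, 1, 15, 12, 0, tzinfo=EST)
print(noon - cutoff)  # 12:00:00
```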

What are the configuration options for using Data Stream?

To deliver data to you properly and promptly, we need to collect some information from you. The Demandbase product team can collect it through a simple Excel form sent to you. Because of the complexity of the product, we also need a technical contact to work with at your company. The following is some of the information we need from you:

  • Trial or not
  • Account name & ID
  • Contact person’s name & title
  • Contact person’s email & phone number
  • Technical contact name if different from contact
  • Technical contact email & phone number if different from contact
  • Date to start the delivery
  • Delivery destination: BigQuery, Redshift, Azure Synapse, GCS, S3, Azure Blob, Demandbase Hosted SFTP, Other
  • Other delivery information

How can I be sure the data is accurate and that it matches the data in Demandbase One?

Data Stream draws from the same database that feeds the Demandbase One Platform; therefore, there should not be any discrepancy between the two. However, because Data Stream is delivered daily and the data is cut off at the end of each day, the current day's data is not yet in Data Stream, which can cause a difference from what you see in the platform.

How can Data Stream help my operation in targeting and engagement on Demandbase?

Dashboards: By using Data Stream, you can build your own dashboard or data warehouse with the data you get from multiple sources.

Insights: You can derive a new set of insights about your customers and campaigns.

  • Apply to account lists and campaigns: You can bring those insights back to the Demandbase One Platform and apply them to an existing or new account list and campaign for better performance.
  • Keyword Sets: For example, you may find different sets of keywords that, when applied to your account lists, target your ad campaigns more effectively.

How do I measure how much using Data Stream helps me?

There are two key improvements with Data Stream that can help your company: time and insights.

With Data Stream, automated delivery saves you the time spent on the manual process of exporting and importing data every day. The other improvement is the number of insights you can get from Data Stream. Since different teams could be using the data in their own way, you can measure the impact of Data Stream by the number of new active dashboards and reports created using Data Stream data.

If you want to measure at a more granular level, you can gauge the improvement by evaluating the metrics tied to those new dashboards. If the data is being used to generate recommendations for the sales team, you can measure how effective those recommendations are and how much revenue they lead to.
