Associate-Data-Practitioner Practice Exam Tests Latest Updated on Sep-2025
NEW QUESTION # 10
You are working with a large dataset of customer reviews stored in Cloud Storage. The dataset contains several inconsistencies, such as missing values, incorrect data types, and duplicate entries. You need to clean the data to ensure that it is accurate and consistent before using it for analysis. What should you do?
- A. Use Cloud Run functions to clean the data and load it into BigQuery. Use SQL for analysis.
- B. Use Storage Transfer Service to move the data to a different Cloud Storage bucket. Use event triggers to invoke Cloud Run functions to load the data into BigQuery. Use SQL for analysis.
- C. Use BigQuery to batch load the data into BigQuery. Use SQL for cleaning and analysis.
- D. Use the PythonOperator in Cloud Composer to clean the data and load it into BigQuery. Use SQL for analysis.
Answer: C
Explanation:
Using BigQuery to batch load the data and perform cleaning and analysis with SQL is the best approach for this scenario. BigQuery provides powerful SQL capabilities to handle missing values, enforce correct data types, and remove duplicates efficiently. This method simplifies the pipeline by leveraging BigQuery's built-in processing power for both cleaning and analysis, reducing the need for additional tools or services and minimizing complexity.
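As a rough illustration of what "cleaning with SQL" could look like after the batch load, the sketch below builds a single BigQuery statement that casts types safely, fills missing text, and removes duplicates. The table names and the columns `review_id`, `rating`, `review_text`, and `loaded_at` are assumptions made for the example; they do not come from the question. The SQL is held in a Python string so it could be passed to a client such as `google.cloud.bigquery.Client.query`.

```python
# Sketch only: all table and column names here are hypothetical.
def build_cleaning_query(raw_table: str, clean_table: str) -> str:
    """Return a BigQuery statement that fixes types, fills gaps, and de-duplicates."""
    return f"""
CREATE OR REPLACE TABLE `{clean_table}` AS
SELECT
  review_id,
  -- Incorrect data types: values that cannot be cast become NULL instead of failing.
  SAFE_CAST(rating AS INT64) AS rating,
  -- Missing values: treat empty strings as missing and substitute a placeholder.
  COALESCE(NULLIF(TRIM(review_text), ''), 'N/A') AS review_text
FROM `{raw_table}`
WHERE review_id IS NOT NULL
-- Duplicate entries: keep only the most recently loaded row per review_id.
QUALIFY ROW_NUMBER() OVER (PARTITION BY review_id ORDER BY loaded_at DESC) = 1
""".strip()


query = build_cleaning_query("my_project.reviews.raw", "my_project.reviews.clean")
print(query)
```

Writing the cleaned rows to a separate table keeps the raw load intact, so the cleaning rules can be adjusted and rerun without reloading from Cloud Storage.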
NEW QUESTION # 11
Your organization has a BigQuery dataset that contains sensitive employee information such as salaries and performance reviews. The payroll specialist in the HR department needs to have continuous access to aggregated performance data, but they do not need continuous access to other sensitive data. You need to grant the payroll specialist access to the performance data without granting them access to the entire dataset using the simplest and most secure approach. What should you do?
- A. Create a SQL query with the aggregated performance data. Export the results to an Avro file in a Cloud Storage bucket. Share the bucket with the payroll specialist.
- B. Create row-level and column-level permissions and policies on the table that contains performance data in the dataset. Provide the payroll specialist with the appropriate permission set.
- C. Use authorized views to share query results with the payroll specialist.
- D. Create a table with the aggregated performance data. Use table-level permissions to grant access to the payroll specialist.
Answer: C
Explanation:
Using authorized views is the simplest and most secure way to grant the payroll specialist access to aggregated performance data without exposing the entire dataset. Authorized views allow you to create a view in BigQuery that contains only the query results for the aggregated performance data. The payroll specialist can query the view without being granted access to the underlying sensitive data. This approach ensures security, adheres to the principle of least privilege, and eliminates the need to manage complex row-level or column-level permissions.
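A minimal sketch of the view definition behind this approach is shown below. It assumes a source table with `department` and `performance_score` columns and a separate dataset named `hr_shared` for the view; all of these names are hypothetical. The second step, authorizing the view on the source dataset, is done through the dataset's access settings (console, API, or client library) and is not shown here.

```python
# Sketch only: dataset, table, and column names are hypothetical.
def build_aggregated_view_ddl(view_id: str, source_table: str) -> str:
    """Return DDL for a view that exposes only aggregated performance data."""
    return f"""
CREATE OR REPLACE VIEW `{view_id}` AS
SELECT
  department,
  AVG(performance_score) AS avg_performance_score,
  COUNT(*) AS employee_count
FROM `{source_table}`
GROUP BY department
""".strip()


ddl = build_aggregated_view_ddl(
    "my_project.hr_shared.performance_summary",
    "my_project.hr_private.employee_reviews",
)
print(ddl)
```

Placing the view in its own dataset lets the payroll specialist be granted access to that dataset alone, while the source dataset stays closed to them.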
NEW QUESTION # 12
You are a data analyst working with sensitive customer data in BigQuery. You need to ensure that only authorized personnel within your organization can query this data, while following the principle of least privilege. What should you do?
- A. Enable access control by using IAM roles.
- B. Encrypt the data by using customer-managed encryption keys (CMEK).
- C. Update dataset privileges by using the SQL GRANT statement.
- D. Export the data to Cloud Storage, and use signed URLs to authorize access.
Answer: A
Explanation:
Using IAM roles to enable access control in BigQuery is the best approach to ensure that only authorized personnel can query the sensitive customer data. IAM allows you to define granular permissions at the project, dataset, or table level, ensuring that users have only the access they need in accordance with the principle of least privilege. For example, you can assign roles like roles/bigquery.dataViewer to allow read-only access or roles/bigquery.dataEditor for more advanced permissions. This approach provides centralized and manageable access control, which is critical for protecting sensitive data.
NEW QUESTION # 13
Your retail company collects customer data from various sources:
Online transactions: Stored in a MySQL database
Customer feedback: Stored as text files on a company server
Social media activity: Streamed in real-time from social media platforms
You are designing a data pipeline to extract this data. Which Google Cloud storage system(s) should you select for further analysis and ML model training?
- A. 1. Online transactions: Cloud Storage
  2. Customer feedback: Cloud Storage
  3. Social media activity: Cloud Storage
- B. 1. Online transactions: Bigtable
  2. Customer feedback: Cloud Storage
  3. Social media activity: Cloud SQL for MySQL
- C. 1. Online transactions: BigQuery
  2. Customer feedback: Cloud Storage
  3. Social media activity: BigQuery
- D. 1. Online transactions: Cloud SQL for MySQL
  2. Customer feedback: BigQuery
  3. Social media activity: Cloud Storage
Answer: C
Explanation:
Online transactions: Storing the transactional data in BigQuery is ideal because BigQuery is a serverless data warehouse optimized for querying and analyzing structured data at scale. It supports SQL queries and is suitable for structured transactional data.
Customer feedback: Storing customer feedback in Cloud Storage is appropriate as it allows you to store unstructured text files reliably and at a low cost. Cloud Storage also integrates well with data processing and ML tools for further analysis.
Social media activity: Storing real-time social media activity in BigQuery is optimal because BigQuery supports streaming inserts, enabling real-time ingestion and analysis of data. This allows immediate analysis and integration into dashboards or ML pipelines.
NEW QUESTION # 14
Your company uses Looker as its primary business intelligence platform. You want to use LookML to visualize the profit margin for each of your company's products in your Looker Explores and dashboards.
You need to implement a solution quickly and efficiently. What should you do?
- A. Apply a filter to only show products with a positive profit margin.
- B. Create a new dimension that categorizes products based on their profit margin ranges (e.g., high, medium, low).
- C. Define a new measure that calculates the profit margin by using the existing revenue and cost fields.
- D. Create a derived table that pre-calculates the profit margin for each product, and include it in the Looker model.
Answer: C
Explanation:
Defining a new measure in LookML to calculate the profit margin using the existing revenue and cost fields is the most efficient and straightforward solution. This approach allows you to dynamically compute the profit margin directly within your Looker Explores and dashboards without needing to pre-calculate or create additional tables. Assuming revenue and cost already exist as measures in the view, the new measure can be defined using LookML syntax such as:

    measure: profit_margin {
      type: number
      # References the existing measures; NULLIF avoids division by zero.
      sql: (${revenue} - ${cost}) / NULLIF(${revenue}, 0) ;;
      value_format: "0.0%"
    }

This method is quick to implement and integrates seamlessly into your existing Looker model, enabling accurate visualization of profit margins across your products.
NEW QUESTION # 15
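Your retail company collects customer data from various sources:
Online transactions: Stored in a MySQL database
Customer feedback: Stored as text files on a company server
Social media activity: Streamed in real-time from social media platforms
You are designing a data pipeline to extract this data. Which Google Cloud storage system(s) should you select for further analysis and ML model training?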
- A. 1. Online transactions: Cloud Storage
  2. Customer feedback: Cloud Storage
  3. Social media activity: Cloud Storage
- B. 1. Online transactions: Bigtable
  2. Customer feedback: Cloud Storage
  3. Social media activity: Cloud SQL for MySQL
- C. 1. Online transactions: BigQuery
  2. Customer feedback: Cloud Storage
  3. Social media activity: BigQuery
- D. 1. Online transactions: Cloud SQL for MySQL
  2. Customer feedback: BigQuery
  3. Social media activity: Cloud Storage
Answer: C
Explanation:
Online transactions: Storing the transactional data in BigQuery is ideal because BigQuery is a serverless data warehouse optimized for querying and analyzing structured data at scale. It supports SQL queries and is suitable for structured transactional data.
Customer feedback: Storing customer feedback in Cloud Storage is appropriate as it allows you to store unstructured text files reliably and at a low cost. Cloud Storage also integrates well with data processing and ML tools for further analysis.
Social media activity: Storing real-time social media activity in BigQuery is optimal because BigQuery supports streaming inserts, enabling real-time ingestion and analysis of data. This allows immediate analysis and integration into dashboards or ML pipelines.
NEW QUESTION # 16
You are a Looker analyst. You need to add a new field to your Looker report that generates SQL that will run against your company's database. You do not have the Develop permission. What should you do?
- A. Create a calculated field using the Add a field option in Looker Studio, and add it to your report.
- B. Create a table calculation from the field picker in Looker, and add it to your report.
- C. Create a custom field from the field picker in Looker, and add it to your report.
- D. Create a new field in the LookML layer, refresh your report, and select your new field from the field picker.
Answer: C
Explanation:
Creating a custom field from the field picker in Looker allows you to add new fields to your report without requiring the Develop permission. Custom fields are created directly in the Looker UI, enabling you to define calculations or transformations that generate SQL for the database query. This approach is user-friendly and does not require access to the LookML layer, making it the appropriate choice for your situation.
NEW QUESTION # 17
Your organization uses a BigQuery table that is partitioned by ingestion time. You need to remove data that is older than one year to reduce your organization's storage costs. You want to use the most efficient approach while minimizing cost. What should you do?
- A. Create a view that filters out rows that are older than one year.
- B. Require users to specify a partition filter using the alter table statement in SQL.
- C. Set the table partition expiration period to one year using the ALTER TABLE statement in SQL.
- D. Create a scheduled query that periodically runs an update statement in SQL that sets the "deleted" column to "yes" for data that is more than one year old. Create a view that filters out rows that have been marked deleted.
Answer: C
Explanation:
Setting the table partition expiration period to one year using the ALTER TABLE statement is the most efficient and cost-effective approach. This automatically deletes data in partitions older than one year, reducing storage costs without requiring manual intervention or additional queries. It minimizes administrative overhead and ensures compliance with your data retention policy while optimizing storage usage in BigQuery.
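For reference, the statement described above could look like the sketch below. The table name is a placeholder, and 365 days is used as the one-year period; `partition_expiration_days` is the BigQuery table option that controls partition expiration. The statement is built as a Python string so it could be submitted through any BigQuery client.

```python
# Sketch only: the table identifier is a hypothetical placeholder.
def build_partition_expiration_ddl(table_id: str, days: int) -> str:
    """Return an ALTER TABLE statement that expires partitions after `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    return f"ALTER TABLE `{table_id}` SET OPTIONS (partition_expiration_days = {days})"


statement = build_partition_expiration_ddl("my_project.my_dataset.events", 365)
print(statement)
```

Because expiration is a property of the table, no scheduled query or DML job has to run to remove old partitions.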
NEW QUESTION # 18
Your retail company wants to predict customer churn using historical purchase data stored in BigQuery. The dataset includes customer demographics, purchase history, and a label indicating whether the customer churned or not. You want to build a machine learning model to identify customers at risk of churning. You need to create and train a logistic regression model for predicting customer churn, using the customer_data table with the churned column as the target label. Which BigQuery ML query should you use?
- A.

- B.

- C.

- D.

Answer: A
Explanation:
In BigQuery ML, when creating a logistic regression model to predict customer churn, the correct query should:
Exclude the target label column (in this case, churned) from the feature columns, as it is used for training and not as a feature input.
Rename the target label column to label, because BigQuery ML treats a column named label as the target by default (alternatively, the input_label_cols option can name a different column).
The chosen query satisfies these requirements:
SELECT * EXCEPT(churned), churned AS label: Excludes churned from features and renames it to label.
The OPTIONS(model_type='logistic_reg') specifies that a logistic regression model is being trained.
This setup ensures the model is correctly trained using the features in the dataset while targeting the churned column for predictions.
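The answer options are not reproduced in this copy, so the sketch below is not a transcription of option A. It only assembles a statement from the pieces the explanation names: `model_type='logistic_reg'`, `SELECT * EXCEPT(churned)`, and `churned AS label`. The dataset and model names are hypothetical.

```python
# Sketch only: dataset and model identifiers are hypothetical.
def build_churn_model_ddl(model_id: str, table_id: str, label_column: str = "churned") -> str:
    """Return a BigQuery ML CREATE MODEL statement for logistic regression."""
    return f"""
CREATE OR REPLACE MODEL `{model_id}`
OPTIONS (model_type = 'logistic_reg') AS
SELECT
  * EXCEPT ({label_column}),
  {label_column} AS label
FROM `{table_id}`
""".strip()


ddl = build_churn_model_ddl("my_dataset.churn_model", "my_dataset.customer_data")
print(ddl)
```

All remaining columns of `customer_data` are used as features, so identifier columns such as a customer ID would normally be excluded as well before training.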
NEW QUESTION # 19
You work for a financial organization that stores transaction data in BigQuery. Your organization has a regulatory requirement to retain data for a minimum of seven years for auditing purposes. You need to ensure that the data is retained for seven years using an efficient and cost-optimized approach. What should you do?
- A. Create a partition by transaction date, and set the partition expiration policy to seven years.
- B. Set the table-level retention policy in BigQuery to seven years.
- C. Export the BigQuery tables to Cloud Storage daily, and enforce a lifecycle management policy that has a seven-year retention rule.
- D. Set the dataset-level retention policy in BigQuery to seven years.
Answer: B
Explanation:
Setting a table-level retention policy in BigQuery to seven years is the most efficient and cost-optimized solution to meet the regulatory requirement. A table-level retention policy ensures that the data cannot be deleted or overwritten before the specified retention period expires, providing compliance with auditing requirements while keeping the data within BigQuery for easy access and analysis. This approach avoids the complexity and additional costs of exporting data to Cloud Storage.
NEW QUESTION # 20
Your organization plans to move their on-premises environment to Google Cloud. Your organization's network bandwidth is less than 1 Gbps. You need to move over 500 TB of data to Cloud Storage securely, and only have a few days to move the data. What should you do?
- A. Request multiple Transfer Appliances, copy the data to the appliances, and ship the appliances back to Google Cloud to upload the data to Cloud Storage.
- B. Connect to Google Cloud using Dedicated Interconnect. Use the gcloud storage command to move the data to Cloud Storage.
- C. Connect to Google Cloud using VPN. Use the gcloud storage command to move the data to Cloud Storage.
- D. Connect to Google Cloud using VPN. Use Storage Transfer Service to move the data to Cloud Storage.
Answer: A
Explanation:
Using Transfer Appliances is the best solution for securely and efficiently moving over 500 TB of data to Cloud Storage within a limited timeframe, especially with network bandwidth below 1 Gbps. Transfer Appliances are physical devices provided by Google Cloud to securely transfer large amounts of data. After copying the data to the appliances, they are shipped back to Google, where the data is uploaded to Cloud Storage. This approach bypasses bandwidth limitations and ensures the data is migrated quickly and securely.
NEW QUESTION # 21
You need to create a new data pipeline. You want a serverless solution that meets the following requirements:
* Data is streamed from Pub/Sub and is processed in real-time.
* Data is transformed before being stored.
* Data is stored in a location that will allow it to be analyzed with SQL using Looker.
Which Google Cloud services should you recommend for the pipeline?
- A. 1. Cloud Composer
  2. Cloud SQL for MySQL
- B. 1. Dataflow
  2. BigQuery
- C. 1. Dataproc Serverless
  2. Bigtable
- D. 1. BigQuery
  2. Analytics Hub
Answer: B
Explanation:
To build a serverless data pipeline that processes data in real-time from Pub/Sub, transforms it, and stores it for SQL-based analysis using Looker, the best solution is to use Dataflow and BigQuery. Dataflow is a fully managed service for real-time data processing and transformation, while BigQuery is a serverless data warehouse that supports SQL-based querying and integrates seamlessly with Looker for data analysis and visualization. This combination meets the requirements for real-time streaming, transformation, and efficient storage for analytical queries.
NEW QUESTION # 22
You have a Cloud SQL for PostgreSQL database that stores sensitive historical financial data. You need to ensure that the data is uncorrupted and recoverable in the event that the primary region is destroyed. The data is valuable, so you need to prioritize recovery point objective (RPO) over recovery time objective (RTO). You want to recommend a solution that minimizes latency for primary read and write operations. What should you do?
- A. Configure the Cloud SQL for PostgreSQL instance for regional availability (HA). Back up the Cloud SQL for PostgreSQL database hourly to a Cloud Storage bucket in a different region.
- B. Configure the Cloud SQL for PostgreSQL instance for multi-region backup locations.
- C. Configure the Cloud SQL for PostgreSQL instance for regional availability (HA) with asynchronous replication to a secondary instance in a different region.
- D. Configure the Cloud SQL for PostgreSQL instance for regional availability (HA) with synchronous replication to a secondary instance in a different zone.
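Answer: C
Explanation:
Why C is correct: Regional availability (HA) keeps a standby instance in a second zone of the same region by using synchronous replication, which protects against a zonal failure. Adding a replica in a different region provides a copy that survives the loss of the entire primary region. Because cross-region replication is asynchronous, primary writes do not wait on the remote region, so read and write latency on the primary stays low, while the replica normally lags by only a short interval, which keeps the RPO small.
Why other options are incorrect:
A: Hourly backups to another region can lose up to an hour of data, so they do not provide the lowest possible RPO.
B: Multi-region backup locations protect the backups themselves, but the RPO is still bounded by the backup schedule.
D: A secondary instance in a different zone of the same region does not survive the destruction of the primary region.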
NEW QUESTION # 23
Your company is building a near real-time streaming pipeline to process JSON telemetry data from small appliances. You need to process messages arriving at a Pub/Sub topic, capitalize letters in the serial number field, and write results to BigQuery. You want to use a managed service and write a minimal amount of code for underlying transformations. What should you do?
- A. Use a Pub/Sub to BigQuery subscription, write results directly to BigQuery, and schedule a transformation query to run every five minutes.
- B. Use a Pub/Sub push subscription, write a Cloud Run service that accepts the messages, performs the transformations, and writes the results to BigQuery.
- C. Use a Pub/Sub to Cloud Storage subscription, write a Cloud Run service that is triggered when objects arrive in the bucket, performs the transformations, and writes the results to BigQuery.
- D. Use the "Pub/Sub to BigQuery" Dataflow template with a UDF, and write the results to BigQuery.
Answer: D
Explanation:
Using the "Pub/Sub to BigQuery" Dataflow template with a UDF (User-Defined Function) is the optimal choice because it combines near real-time processing, minimal code for transformations, and scalability. The UDF allows for efficient implementation of custom transformations, such as capitalizing letters in the serial number field, while Dataflow handles the rest of the managed pipeline seamlessly.
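The transformation itself is only a few lines. The sketch below shows the logic in Python, with `serial_number` as an assumed field name and `process` as an assumed function name. The classic version of this template expects the UDF in JavaScript, so the same logic would need to be ported unless a Python-UDF variant of the template is used.

```python
import json


def process(message: str) -> str:
    """Upper-case the serial number in one JSON telemetry message."""
    record = json.loads(message)
    serial = record.get("serial_number")  # hypothetical field name
    if isinstance(serial, str):
        record["serial_number"] = serial.upper()
    # Messages without a string serial number pass through unchanged.
    return json.dumps(record)


print(process('{"serial_number": "ab12cd", "temp_c": 21.5}'))
# {"serial_number": "AB12CD", "temp_c": 21.5}
```

The function takes one message and returns one message, which matches the per-element shape a template UDF works with; reading from Pub/Sub and writing to BigQuery stay inside the managed template.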
NEW QUESTION # 24
You need to transfer approximately 300 TB of data from your company's on-premises data center to Cloud Storage. You have 100 Mbps internet bandwidth, and the transfer needs to be completed as quickly as possible. What should you do?
- A. Use Cloud Client Libraries to transfer the data over the internet.
- B. Use the gcloud storage command to transfer the data over the internet.
- C. Compress the data, upload it to multiple cloud storage providers, and then transfer the data to Cloud Storage.
- D. Request a Transfer Appliance, copy the data to the appliance, and ship it back to Google.
Answer: D
Explanation:
Transferring 300 TB over a 100 Mbps connection would take an impractical amount of time: roughly 280 to 305 days at the theoretical maximum speed (depending on whether TB is read as decimal or binary), and longer once real-world overhead is included. Google Cloud provides the Transfer Appliance for large-scale, time-sensitive transfers.
* Option A: Cloud Client Libraries over the internet would be slow and unreliable for 300 TB due to bandwidth limitations.
* Option B: The gcloud storage command is constrained by the same 100 Mbps link, so it cannot complete a 300 TB transfer quickly.
* Option C: Compressing and splitting across multiple providers adds complexity and isn't a Google-supported method for Cloud Storage ingestion.
* Option D: Copying the data to a Transfer Appliance and shipping it to Google bypasses the bandwidth limit, which makes it the fastest option here.
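The transfer-time estimate can be checked with a short back-of-the-envelope calculation. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) and a fully saturated link with no protocol overhead; the function name is made up for the example.

```python
def transfer_days(size_tb: float, bandwidth_mbps: float) -> float:
    """Days needed to move size_tb terabytes at bandwidth_mbps, ignoring overhead."""
    bits = size_tb * 1e12 * 8                  # decimal terabytes to bits
    seconds = bits / (bandwidth_mbps * 1e6)    # megabits per second to bits per second
    return seconds / 86_400                    # seconds per day


print(round(transfer_days(300, 100), 1))   # 277.8
print(round(transfer_days(500, 1000), 1))  # 46.3
```

The first figure is the 300 TB over 100 Mbps case discussed here; reading TB as binary (2^40 bytes) raises it to about 305 days. The second figure shows that even 500 TB over a full 1 Gbps link needs more than six weeks, far longer than a deadline of a few days allows.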