
Pass the Amazon AWS Certified Data Analytics - Specialty (DAS-C01) Exam in First Attempt Guaranteed: Updated Dump from TorrentExam!
Pass AWS-Certified-Data-Analytics-Specialty Exam with 159 Questions - Verified By TorrentExam
The Amazon DAS-C01 exam is intended for data analysts, data engineers, and data scientists who work with AWS services for data analysis. The exam covers a wide range of topics, including data collection, storage, processing, and analysis. Candidates must have a solid understanding of AWS services such as Amazon S3, Amazon Kinesis, Amazon EMR, and Amazon Redshift. They must also be able to use AWS services to build and deploy data models, and to visualize and communicate insights from data.
NEW QUESTION # 29
A financial company hosts a data lake in Amazon S3 and a data warehouse on an Amazon Redshift cluster.
The company uses Amazon QuickSight to build dashboards and wants to secure access from its on-premises Active Directory to Amazon QuickSight.
How should the data be secured?
- A. Use a VPC endpoint to connect to Amazon S3 from Amazon QuickSight and an IAM role to authenticate Amazon Redshift.
- B. Use an Active Directory connector and single sign-on (SSO) in a corporate network environment.
- C. Place Amazon QuickSight and Amazon Redshift in the security group and use an Amazon S3 endpoint to connect Amazon QuickSight to Amazon S3.
- D. Establish a secure connection by creating an S3 endpoint to connect Amazon QuickSight and a VPC endpoint to connect to Amazon Redshift.
Answer: B
NEW QUESTION # 30
A mobile gaming company wants to capture data from its gaming app and make the data available for analysis immediately. The data record size will be approximately 20 KB. The company is concerned about achieving optimal throughput from each device. Additionally, the company wants to develop a data stream processing application with dedicated throughput for each consumer.
Which solution would achieve this goal?
- A. Have the app use Amazon Kinesis Producer Library (KPL) to send data to Kinesis Data Firehose. Use the enhanced fan-out feature while consuming the data.
- B. Have the app call the PutRecordBatch API to send data to Amazon Kinesis Data Firehose. Submit a support case to enable dedicated throughput on the account.
- C. Have the app call the PutRecords API to send data to Amazon Kinesis Data Streams. Host the stream-processing application on Amazon EC2 with Auto Scaling.
- D. Have the app call the PutRecords API to send data to Amazon Kinesis Data Streams. Use the enhanced fan-out feature while consuming the data.
Answer: D
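As a rough illustration of the PutRecords path named in options C and D, the sketch below groups 20 KB records into batches before they are sent. The per-call limits (500 records, 5 MiB) are assumptions based on commonly documented Kinesis Data Streams quotas and should be checked against current quotas. `batch_records` is a hypothetical helper, and it counts payload bytes only, ignoring partition keys.

```python
from typing import Iterable, Iterator, List

# Assumed PutRecords limits; verify against current service quotas.
MAX_RECORDS_PER_CALL = 500
MAX_BYTES_PER_CALL = 5 * 1024 * 1024


def batch_records(
    records: Iterable[bytes],
    max_records: int = MAX_RECORDS_PER_CALL,
    max_bytes: int = MAX_BYTES_PER_CALL,
) -> Iterator[List[bytes]]:
    """Yield lists of records that stay within both per-call limits."""
    batch: List[bytes] = []
    size = 0
    for record in records:
        # Close the current batch if adding this record would break a limit.
        if batch and (len(batch) >= max_records or size + len(record) > max_bytes):
            yield batch
            batch, size = [], 0
        batch.append(record)
        size += len(record)
    if batch:
        yield batch


# 600 records of 20 KB each, as in the scenario.
records = [b"x" * (20 * 1024)] * 600
batch_sizes = [len(b) for b in batch_records(records)]
```

With 20 KB records the byte limit is reached before the record-count limit, so each full batch holds 256 records. In a real producer each batch would be handed to the Kinesis `put_records` call, and consumers registered for enhanced fan-out would each receive their own read throughput per shard.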
NEW QUESTION # 31
A bank wants to migrate a Teradata data warehouse to the AWS Cloud. The bank needs a solution for reading large amounts of data and requires the highest possible performance. The solution also must maintain the separation of storage and compute.
Which solution meets these requirements?
- A. Use Amazon Redshift with RA3 nodes to query the data in Amazon Redshift managed storage.
- B. Use Amazon Athena to query the data in Amazon S3.
- C. Use Amazon Redshift with dense compute nodes to query the data in Amazon Redshift managed storage.
- D. Use PrestoDB on Amazon EMR to query the data in Amazon S3.
Answer: A
NEW QUESTION # 32
A company is planning to do a proof of concept for a machine learning (ML) project using Amazon SageMaker with a subset of existing on-premises data hosted in the company's 3 TB data warehouse. For part of the project, AWS Direct Connect is established and tested. To prepare the data for ML, data analysts are performing data curation. The data analysts want to perform multiple steps, including mapping, dropping null fields, resolving choice, and splitting fields. The company needs the fastest solution to curate the data for this project.
Which solution meets these requirements?
- A. Ingest data into Amazon S3 using AWS DataSync and use Apache Spark scripts to curate the data in an Amazon EMR cluster. Store the curated data in Amazon S3 for ML processing.
- B. Ingest data into Amazon S3 using AWS DMS. Use AWS Glue to perform data curation and store the data in Amazon S3 for ML processing.
- C. Create custom ETL jobs on-premises to curate the data. Use AWS DMS to ingest data into Amazon S3 for ML processing.
- D. Take a full backup of the data store and ship the backup files using AWS Snowball. Upload Snowball data into Amazon S3 and schedule data curation jobs using AWS Batch to prepare the data for ML.
Answer: B
NEW QUESTION # 33
A company is sending historical datasets to Amazon S3 for storage. A data engineer at the company wants to make these datasets available for analysis using Amazon Athena. The engineer also wants to encrypt the Athena query results in an S3 results location by using AWS solutions for encryption. The requirements for encrypting the query results are as follows:
Use custom keys for encryption of the primary dataset query results.
Use generic encryption for all other query results.
Provide an audit trail for the primary dataset queries that shows when the keys were used and by whom.
Which solution meets these requirements?
- A. Use server-side encryption with S3 managed encryption keys (SSE-S3) for the primary dataset. Use SSE-S3 for the other datasets.
- B. Use server-side encryption with AWS KMS managed customer master keys (SSE-KMS CMKs) for the primary dataset. Use server-side encryption with S3 managed encryption keys (SSE-S3) for the other datasets.
- C. Use server-side encryption with customer-provided encryption keys (SSE-C) for the primary dataset. Use server-side encryption with S3 managed encryption keys (SSE-S3) for the other datasets.
- D. Use client-side encryption with AWS Key Management Service (AWS KMS) customer managed keys for the primary dataset. Use S3 client-side encryption with client-side keys for the other datasets.
Answer: B
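To make the SSE-KMS versus SSE-S3 distinction concrete, here is a minimal sketch of the `ResultConfiguration` structure that Athena accepts when a query is started. The helper name, bucket paths, and key ARN are hypothetical placeholders; only the dictionary keys and the `SSE_KMS` / `SSE_S3` option values come from the Athena API.

```python
from typing import Optional


def result_configuration(output_location: str, kms_key_arn: Optional[str] = None) -> dict:
    """Build an Athena ResultConfiguration dict.

    With a KMS key ARN the results are encrypted with SSE_KMS (custom key);
    without one they fall back to SSE_S3 (generic encryption).
    """
    if kms_key_arn:
        encryption = {"EncryptionOption": "SSE_KMS", "KmsKey": kms_key_arn}
    else:
        encryption = {"EncryptionOption": "SSE_S3"}
    return {"OutputLocation": output_location, "EncryptionConfiguration": encryption}


# Hypothetical locations and key ARN, for illustration only.
primary = result_configuration(
    "s3://example-results/primary/",
    "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
)
other = result_configuration("s3://example-results/other/")
```

Either dictionary would be passed as `ResultConfiguration` to the Athena `start_query_execution` call. Because use of a KMS key is recorded in AWS CloudTrail, the SSE-KMS variant is the one that yields a record of when the key was used and by whom.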
NEW QUESTION # 34
A reseller that has thousands of AWS accounts receives AWS Cost and Usage Reports in an Amazon S3 bucket. The reports are delivered to the S3 bucket in the following format:
<example-report-prefix>/<example-report-name>/yyyymmdd-yyyymmdd/<example-report-name>.parquet
An AWS Glue crawler crawls the S3 bucket and populates an AWS Glue Data Catalog with a table. Business analysts use Amazon Athena to query the table and create monthly summary reports for the AWS accounts. The business analysts are experiencing slow queries because of the accumulation of reports from the last 5 years. The business analysts want the operations team to make changes to improve query performance.
Which action should the operations team take to meet these requirements?
- A. Change the file format to csv.zip.
- B. Partition the data by date and account ID.
- C. Partition the data by month and account ID.
- D. Partition the data by account ID, year, and month.
Answer: B
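The mechanics of partitioning can be sketched independently of which key combination is chosen. The example below builds a Hive-style S3 prefix from an account ID and a report period. The key names (`account_id`, `year`, `month`) follow the wording of option D purely for illustration, and `partition_prefix` is a hypothetical helper rather than part of any AWS SDK.

```python
from datetime import date


def partition_prefix(account_id: str, period_start: date) -> str:
    """Return a Hive-style partition prefix (key=value path segments)."""
    return (
        f"account_id={account_id}/"
        f"year={period_start.year:04d}/"
        f"month={period_start.month:02d}/"
    )


# Hypothetical account ID and report period.
prefix = partition_prefix("111122223333", date(2023, 4, 1))
```

When the table is partitioned on such keys, an Athena query that filters on them reads only the matching prefixes instead of scanning five years of reports.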
NEW QUESTION # 35
An Amazon Redshift database contains sensitive user data. Logging is necessary to meet compliance requirements. The logs must contain database authentication attempts, connections, and disconnections. The logs must also contain each query run against the database and record which database user ran each query.
Which steps will create the required logs?
- A. Enable and download audit reports from AWS Artifact.
- B. Enable Amazon Redshift Enhanced VPC Routing. Enable VPC Flow Logs to monitor traffic.
- C. Enable audit logging for Amazon Redshift using the AWS Management Console or the AWS CLI.
- D. Allow access to the Amazon Redshift database using AWS IAM only. Log access using AWS CloudTrail.
Answer: C
NEW QUESTION # 36
A financial company uses Amazon S3 as its data lake and has set up a data warehouse using a multi-node Amazon Redshift cluster. The data files in the data lake are organized in folders based on the data source of each data file. All the data files are loaded to one table in the Amazon Redshift cluster using a separate COPY command for each data file location. With this approach, loading all the data files into Amazon Redshift takes a long time to complete. Users want a faster solution with little or no increase in cost while maintaining the segregation of the data files in the S3 data lake.
Which solution meets these requirements?
- A. Use Amazon EMR to copy all the data files into one folder and issue a COPY command to load the data into Amazon Redshift.
- B. Create a manifest file that contains the data file locations and issue a COPY command to load the data into Amazon Redshift.
- C. Use an AWS Glue job to copy all the data files into one folder and issue a COPY command to load the data into Amazon Redshift.
- D. Load all the data files in parallel to Amazon Aurora, and run an AWS Glue job to load the data into Amazon Redshift.
Answer: B
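The manifest approach in option B can be sketched as follows: a JSON manifest lists the data files wherever they live, and a single COPY command with the MANIFEST keyword loads them in one parallel operation. The bucket names, table name, and role ARN are hypothetical, and data format options (for example CSV) are omitted for brevity.

```python
import json
from typing import List


def build_manifest(file_urls: List[str]) -> str:
    """Return a Redshift COPY manifest as a JSON string."""
    entries = [{"url": url, "mandatory": True} for url in file_urls]
    return json.dumps({"entries": entries}, indent=2)


def build_copy_sql(table: str, manifest_url: str, iam_role_arn: str) -> str:
    """Return a COPY statement that loads every file listed in the manifest."""
    return (
        f"COPY {table}\n"
        f"FROM '{manifest_url}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "MANIFEST;"
    )


# Hypothetical file locations in separate source folders.
manifest = build_manifest([
    "s3://example-lake/source-a/part-0000.csv",
    "s3://example-lake/source-b/part-0000.csv",
])
copy_sql = build_copy_sql(
    "sales",
    "s3://example-lake/manifests/load.manifest",
    "arn:aws:iam::111122223333:role/ExampleRedshiftCopyRole",
)
```

The files stay in their original folders, so the segregation of the data lake is preserved and no extra copy job is needed.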
NEW QUESTION # 37
A company uses Amazon Elasticsearch Service (Amazon ES) to store and analyze its website clickstream data. The company ingests 1 TB of data daily using Amazon Kinesis Data Firehose and stores one day's worth of data in an Amazon ES cluster.
The company has very slow query performance on the Amazon ES index and occasionally sees errors from Kinesis Data Firehose when attempting to write to the index. The Amazon ES cluster has 10 nodes running a single index and 3 dedicated master nodes. Each data node has 1.5 TB of Amazon EBS storage attached and the cluster is configured with 1,000 shards. Occasionally, JVMMemoryPressure errors are found in the cluster logs.
Which solution will improve the performance of Amazon ES?
- A. Increase the memory of the Amazon ES master nodes.
- B. Decrease the number of Amazon ES data nodes.
- C. Increase the number of Amazon ES shards for the index.
- D. Decrease the number of Amazon ES shards for the index.
Answer: D
Explanation:
https://aws.amazon.com/premiumsupport/knowledge-center/high-jvm-memory-pressure-elasticsearch/
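A quick back-of-the-envelope calculation shows why 1,000 shards is too many for this index. The 30 GB target shard size is an assumption chosen for illustration (sizing guidance generally recommends shards in the tens of gigabytes), and the arithmetic ignores replicas and indexing overhead.

```python
import math

daily_data_gb = 1024        # 1 TB of data per day, from the question
current_shards = 1000       # from the question
target_shard_size_gb = 30   # assumption for illustration

# Average size of each shard under the current configuration.
avg_shard_size_gb = daily_data_gb / current_shards

# Shard count that would bring shards up to roughly the target size.
suggested_shards = math.ceil(daily_data_gb / target_shard_size_gb)
```

Each shard currently holds only about 1 GB, so the cluster spends JVM heap tracking many tiny shards. Under the stated assumption, a few dozen shards would hold the same data.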
NEW QUESTION # 38
A marketing company is using Amazon EMR clusters for its workloads. The company manually installs third-party libraries on the clusters by logging in to the master nodes. A data analyst needs to create an automated solution to replace the manual process.
Which options can fulfill these requirements? (Choose two.)
- A. Launch an Amazon EC2 instance with Amazon Linux and install the required third-party libraries on the instance. Create an AMI and use that AMI to create the EMR cluster.
- B. Use an Amazon DynamoDB table to store the list of required applications. Trigger an AWS Lambda function with DynamoDB Streams to install the software.
- C. Install the required third-party libraries in the existing EMR master node. Create an AMI out of that master node and use that custom AMI to re-create the EMR cluster.
- D. Place the required installation scripts in Amazon S3 and execute them through Apache Spark in Amazon EMR.
- E. Place the required installation scripts in Amazon S3 and execute them using custom bootstrap actions.
Answer: A,E
Explanation:
https://aws.amazon.com/about-aws/whats-new/2017/07/amazon-emr-now-supports-launching-clusters-with-custo
https://docs.aws.amazon.com/de_de/emr/latest/ManagementGuide/emr-plan-bootstrap.html
NEW QUESTION # 39
A data analyst is using Amazon QuickSight for data visualization across multiple datasets generated by applications. Each application stores files within a separate Amazon S3 bucket. AWS Glue Data Catalog is used as a central catalog across all application data in Amazon S3. A new application stores its data within a separate S3 bucket. After updating the catalog to include the new application data source, the data analyst created a new Amazon QuickSight data source from an Amazon Athena table, but the import into SPICE failed.
How should the data analyst resolve the issue?
- A. Edit the permissions for the new S3 bucket from within the S3 console.
- B. Edit the permissions for the AWS Glue Data Catalog from within the AWS Glue console.
- C. Edit the permissions for the new S3 bucket from within the Amazon QuickSight console.
- D. Edit the permissions for the AWS Glue Data Catalog from within the Amazon QuickSight console.
Answer: C
NEW QUESTION # 40
Once a month, a company receives a 100 MB .csv file compressed with gzip. The file contains 50,000 property listing records and is stored in Amazon S3 Glacier. The company needs its data analyst to query a subset of the data for a specific vendor.
What is the most cost-effective solution?
- A. Query the data from Amazon S3 Glacier directly with Amazon Glacier Select.
- B. Load the data to Amazon S3 and query it with Amazon Redshift Spectrum.
- C. Load the data into Amazon S3 and query it with Amazon S3 Select.
- D. Load the data to Amazon S3 and query it with Amazon Athena.
Answer: C
NEW QUESTION # 41
A company stores its sales and marketing data that includes personally identifiable information (PII) in Amazon S3. The company allows its analysts to launch their own Amazon EMR cluster and run analytics reports with the data. To meet compliance requirements, the company must ensure the data is not publicly accessible throughout this process. A data engineer has secured Amazon S3 but must ensure the individual EMR clusters created by the analysts are not exposed to the public internet.
Which solution should the data engineer use to meet this compliance requirement with the LEAST amount of effort?
- A. Enable the block public access setting for Amazon EMR at the account level before any EMR cluster is created.
- B. Check the security group of the EMR clusters regularly to ensure it does not allow inbound traffic from IPv4 0.0.0.0/0 or IPv6 ::/0.
- C. Use AWS WAF to block public internet access to the EMR clusters across the board.
- D. Create an EMR security configuration and ensure the security configuration is associated with the EMR clusters when they are created.
Answer: A
Explanation:
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-block-public-access.html
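For reference, the account-level setting takes a small configuration structure. The sketch below builds it as a plain dictionary. The helper name is hypothetical, and the port 22 exception is an assumption for illustration; pass an empty tuple to permit no public ranges.

```python
from typing import Iterable


def block_public_access_config(allowed_ports: Iterable[int] = (22,)) -> dict:
    """Build an EMR BlockPublicAccessConfiguration dict.

    Public security group rules are blocked except for the listed ports.
    """
    return {
        "BlockPublicSecurityGroupRules": True,
        "PermittedPublicSecurityGroupRuleRanges": [
            {"MinRange": port, "MaxRange": port} for port in allowed_ports
        ],
    }


config = block_public_access_config()
strict_config = block_public_access_config(allowed_ports=())
```

The dictionary would be passed as `BlockPublicAccessConfiguration` to the EMR `put_block_public_access_configuration` call once per Region, after which clusters launched by any analyst are covered without per-cluster checks.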
NEW QUESTION # 42
A marketing company is storing its campaign response data in Amazon S3. A consistent set of sources has generated the data for each campaign. The data is saved into Amazon S3 as .csv files. A business analyst will use Amazon Athena to analyze each campaign's data. The company needs the cost of ongoing data analysis with Athena to be minimized.
Which combination of actions should a data analytics specialist take to meet these requirements? (Choose two.)
- A. Convert the .csv files to Apache Parquet.
- B. Compress the .csv files.
- C. Partition the data by source.
- D. Convert the .csv files to Apache Avro.
- E. Partition the data by campaign.
Answer: A,E
Explanation:
https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-tips-for-amazon-athena/
NEW QUESTION # 43
A company's data analyst needs to ensure that queries executed in Amazon Athena cannot scan more than a prescribed amount of data for cost control purposes. Queries that exceed the prescribed threshold must be canceled immediately.
What should the data analyst do to achieve this?
- A. Configure Athena to invoke an AWS Lambda function that terminates queries when the prescribed threshold is crossed.
- B. Enforce the prescribed threshold on all Amazon S3 bucket policies.
- C. For each workgroup, set the workgroup-wide data usage control limit to the prescribed threshold.
- D. For each workgroup, set the control limit for each query to the prescribed threshold.
Answer: D
Explanation:
https://docs.aws.amazon.com/athena/latest/ug/manage-queries-control-costs-with-workgroups.html
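The per-query control limit corresponds to the `BytesScannedCutoffPerQuery` field of a workgroup configuration. The sketch below builds that configuration as a dictionary. The helper name is hypothetical, and the 10 MB minimum is an assumption to verify against the current Athena documentation.

```python
# Assumed minimum value for the per-query limit; verify before relying on it.
MIN_CUTOFF_BYTES = 10_000_000


def workgroup_configuration(per_query_limit_bytes: int) -> dict:
    """Build an Athena workgroup Configuration with a per-query scan limit."""
    if per_query_limit_bytes < MIN_CUTOFF_BYTES:
        raise ValueError("per-query limit is below the assumed 10 MB minimum")
    return {
        # Prevent clients from overriding the workgroup settings.
        "EnforceWorkGroupConfiguration": True,
        "PublishCloudWatchMetricsEnabled": True,
        "BytesScannedCutoffPerQuery": per_query_limit_bytes,
    }


# Example: cancel any query that would scan more than 1 TB.
one_tb_config = workgroup_configuration(10**12)
```

The dictionary would be passed as `Configuration` to the Athena `create_work_group` call. A query that exceeds the limit is canceled, whereas the workgroup-wide limit in option C only triggers an alarm-style action on aggregate usage.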
NEW QUESTION # 44
An operations team notices that a few AWS Glue jobs for a given ETL application are failing. The AWS Glue jobs read a large number of small JSON files from an Amazon S3 bucket and write the data to a different S3 bucket in Apache Parquet format with no major transformations. Upon initial investigation, a data engineer notices the following error message in the History tab on the AWS Glue console: "Command Failed with Exit Code 1." Upon further investigation, the data engineer notices that the driver memory profile of the failed jobs crosses the safe threshold of 50% usage quickly and reaches 90-95% soon after. The average memory usage across all executors continues to be less than 4%.
The data engineer also notices the following error while examining the related Amazon CloudWatch Logs.
What should the data engineer do to solve the failure in the MOST cost-effective way?
- A. Modify the AWS Glue ETL code to use the 'groupFiles': 'inPartition' feature.
- B. Change the worker type from Standard to G.2X.
- C. Increase the fetch size setting by using AWS Glue dynamic frames.
- D. Modify maximum capacity to increase the total maximum data processing units (DPUs) used.
Answer: A
Explanation:
https://docs.aws.amazon.com/glue/latest/dg/monitor-profile-debug-oom-abnormalities.html#monitor-debug-oom
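The grouping feature in option A is switched on through the S3 connection options of the read step. The sketch below builds those options as a dictionary. The helper name, bucket path, and 1 MiB group size are hypothetical, while `groupFiles` and `groupSize` are the option names used by AWS Glue.

```python
def grouped_s3_options(path: str, group_size_bytes: int = 1024 * 1024) -> dict:
    """Build S3 connection options that group many small files per task."""
    return {
        "paths": [path],
        "recurse": True,
        # Group files within each S3 partition so the driver tracks fewer tasks.
        "groupFiles": "inPartition",
        # Glue expects the target group size in bytes, given as a string.
        "groupSize": str(group_size_bytes),
    }


# Hypothetical input location.
options = grouped_s3_options("s3://example-input-bucket/json/")

# Inside a Glue job these options would be passed as connection_options to
# glueContext.create_dynamic_frame_from_options(connection_type="s3", format="json", ...).
```

Grouping reduces the number of in-memory partitions the driver has to track, which addresses the driver memory pressure without paying for larger workers or more DPUs.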
NEW QUESTION # 45
A company hosts an on-premises PostgreSQL database that contains historical data. An internal legacy application uses the database for read-only activities. The company's business team wants to move the data to a data lake in Amazon S3 as soon as possible and enrich the data for analytics.
The company has set up an AWS Direct Connect connection between its VPC and its on-premises network. A data analytics specialist must design a solution that achieves the business team's goals with the least operational overhead.
Which solution meets these requirements?
- A. Configure an AWS Glue crawler to use a JDBC connection to catalog the data in the on-premises database. Use an AWS Glue job to enrich the data and save the result to Amazon S3 in Apache Parquet format. Create an Amazon Redshift cluster and use Amazon Redshift Spectrum to query the data.
- B. Create an Amazon RDS for PostgreSQL database and use AWS Database Migration Service (AWS DMS) to migrate the data into Amazon RDS. Use AWS Data Pipeline to copy and enrich the data from the Amazon RDS for PostgreSQL table and move the data to Amazon S3. Use Amazon Athena to query the data.
- C. Configure an AWS Glue crawler to use a JDBC connection to catalog the data in the on-premises database. Use an AWS Glue job to enrich the data and save the result to Amazon S3 in Apache Parquet format. Use Amazon Athena to query the data.
- D. Upload the data from the on-premises PostgreSQL database to Amazon S3 by using a customized batch upload process. Use the AWS Glue crawler to catalog the data in Amazon S3. Use an AWS Glue job to enrich and store the result in a separate S3 bucket in Apache Parquet format. Use Amazon Athena to query the data.
Answer: C
The AWS Certified Data Analytics - Specialty (DAS-C01) Exam is a certification exam offered by Amazon Web Services (AWS) that assesses the skills and knowledge of data professionals in designing and implementing AWS services to derive insights from data. The exam tests the candidate's ability to use AWS services for data analysis, as well as their ability to understand and optimize data input and output.

