Questions tagged [qubole]

Qubole Data Service (QDS) is a cloud Big Data service running on an elastic Hadoop-based cluster.

The creators of Facebook’s Big Data infrastructure and Apache Hive have leveraged their experience to deliver Qubole Data Service (QDS), a cloud Big Data service offering the same advanced capabilities used by Big Data savvy organizations.

Minimize operational interaction and provide your data analysts with an easy-to-use graphical interface, built-in connectors, and seamless, elastic cloud infrastructure.

Your Hadoop cluster is ready within minutes of signup, letting you focus on building sophisticated data pipelines, running queries, scheduling jobs, and monetizing your big data.

With an auto-scaling cluster, improved I/O optimization, faster queries, and support for hybrid pricing, you can realize total cost savings of as much as 50%-60% while accomplishing tasks faster.

82 questions

votes

2 answers

Implement case class inside a class

I am using the below code to run in Qubole Notebook and the code is running successfully. case class cls_Sch(Id:String, Name:String) class myClass { implicit val sparkSession =…

asked Jul 10 '19 at 14:17

Sarath KS

votes

1 answer

retrieve size of data copied with hadoop distcp

I am running a hadoop distcp command as below: hadoop distcp src-loc target-loc. I want to know the size of the data copied by running this command. I am planning to run the command on Qubole. Any help is appreciated.

hadoop size distcp qubole

asked May 16 '19 at 23:41

sneha salvi

votes

1 answer

Big files causing shuffle error in hadoop map reduce

I see the following error when I try to process big files (size > 35GB), but it doesn't happen with smaller files (size < 10GB). App > Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in…

java hadoop mapreduce qubole

asked Oct 08 '18 at 18:18

Jal

votes

1 answer

Get correct value from array in Hive QL

I have a Wrapped Array and want to only get the corresponding value struct when I query with LATERAL VIEW EXPLODE. SAMPLE STRUCTURE: COLUMNNAME: theARRAY WrappedArray([null,theVal,valTags,[123,null,null,null,null,null],false],…

sql hive apache-spark-sql hiveql qubole

asked Sep 26 '18 at 00:49

noobeerp

votes

0 answers

Convert column in presto from epoch to date

I tried this but that didn't work. cast(from_unixtime('1532568232662880')) as date Any other ideas?

epoch presto qubole

asked Aug 30 '18 at 18:59

nak5120
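
A note on the epoch question above: the 16-digit value looks like microseconds since the epoch rather than seconds. That is an assumption based on its magnitude alone, not something the asker states. Presto's from_unixtime takes a numeric value in seconds, so the value would probably need to be scaled down before conversion. A minimal Python sketch of that scaling, with the helper name epoch_micros_to_date invented for illustration:

```python
from datetime import datetime, timezone


def epoch_micros_to_date(value):
    """Convert an epoch timestamp in microseconds to a UTC calendar date.

    Assumption: the input counts microseconds since 1970-01-01 UTC.
    """
    micros = int(value)
    # Split into whole seconds and leftover microseconds to avoid float rounding.
    seconds, remainder = divmod(micros, 1_000_000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.replace(microsecond=remainder).date()


print(epoch_micros_to_date("1532568232662880"))
```

The same idea in SQL would be to divide the numeric value by 1,000,000 before converting it and casting the result to a date.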

votes

1 answer

Amazon s3Exception bad request and location constraint in hadoop s3a

Does location constraint require extra permission policy for hadoop s3a? I am seeing Exception in thread "main" com.qubole.com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad…

amazon-web-services hadoop amazon-s3 qubole

asked Aug 29 '18 at 21:38

Jal

votes

1 answer

How do I get the value without the square brackets

I have created a dataframe using Scala and Spark SQL. I wanted the first value from the table but I am getting it inside of square brackets []. Can I just get the value without the brackets? Code: val sigh = sqlContext.sql("""SELECT DISTINCT…

scala apache-spark apache-spark-sql user-defined-functions qubole

asked Aug 22 '18 at 00:20

noobeerp

votes

0 answers

select a table from a database in R

I am using dbplyr to select a table from a remote database using Rstudio. I connected with Spark in the server using livy. It shows me the databases I have but when I try to access one of the tables in one of the schemas, it…

r apache-spark sparklyr dbplyr qubole

asked Jul 17 '18 at 16:12

Fisseha Berhane

votes

1 answer

Get Not Null Values in Wrapped Array

I have a Wrapped Array and want to only get the Non Null values when I query with LATERAL VIEW EXPLODE. I also tried IS NOT NULL but that does not return anything. SAMPLE STRUCTURE: COLUMNNAME:…

apache-spark hive apache-spark-sql bigdata qubole

asked Jul 05 '18 at 03:41

noobeerp

votes

1 answer

Set partition location in Qubole metastore using Spark

How do I set the partition location for my Hive table in the Qubole metastore? I know that this is a MySQL DB, but how do I access it and pass a SQL script with a fix using Spark? UPD: The issue is that ALTER TABLE table_name [PARTITION (partition_spec)] SET…

apache-spark hadoop hive qubole

asked Apr 11 '18 at 12:20

Vova Lis

votes

0 answers

Container Packing in YARN

Qubole has implemented Container Packing in YARN for cloud deployments to reduce infrastructure cost. Is there any similar implementation available in the open source world?

hadoop yarn hadoop2 qubole

asked Jan 23 '18 at 05:04

banjara

votes

1 answer

Qubole: How can I download scheduler result in python?

As the title says, I managed to download the Qubole result using the query ID in Python. However, is there a way to download the result using the scheduler job ID instead of the query ID? Thanks.

python qubole

asked Dec 29 '17 at 08:00

atsang01

votes

1 answer

unable to connect ms sql server from Presto in Qubole

I am using Qubole Data Service on Microsoft Azure. I have created a Presto cluster in Qubole. I want to connect to MS SQL Server from Presto to read data from it. I have created sqlserver directory on…

sql-server azure presto qubole

asked Dec 15 '17 at 13:59

Heta Desai

votes

1 answer

Comparing one day worth of data from S3 buckets faster

Consider the 2 data flows below: 1. Front End Box ----> S3 Bucket-1 2. Front End Box ----> Kafka --> Storm ---> S3 Bucket-2. The logs from the boxes are being transferred to S3 buckets. The requirement is to replace flow 1 with flow 2. Now the data…

python validation amazon-s3 hive qubole

asked Apr 25 '17 at 21:02

Albatross

votes

1 answer

How to query data from gz file of Amazon S3 using Qubole Hive query?

I need to get specific data from a gz file. How do I write the SQL? Can I just query it like a database table? For example: Select * from gz_File_Name where key = 'keyname' limit 10. But it always comes back with an error.

amazon-s3 hive gzip qubole

asked Mar 22 '17 at 05:28

daxue
