QBoard » Big Data » Big Data - Spark
  • Vaibhav Mali
    Let's assume for the following that only one Spark job is running at every point in time.
    What I get so far
    Here is my understanding of what happens in...  more
    Last post by Advika Banerjee - February 11, 2022
    546 views 0 likes
    3
  • Maryam Bains
    Assume df1 and df2 are two DataFrames in Apache Spark, computed using two different mechanisms, e.g., Spark SQL vs. the Scala/Java/Python API. Is there an idiomatic way to...  more
    Last post by Vaibhav Mali - February 2, 2022
    213 views 0 likes
    3
  • Samar Patil
    I prefer Python over Scala. But as Spark is natively written in Scala, I expected my code to run faster in the Scala version than in the Python version, for obvious reasons. With that...  more
    Last post by Advika Banerjee - January 17, 2022
    163 views 0 likes
    3
  • Rakesh Racharla
    I am using CDH 5.2. I am able to use spark-shell to run the commands. How can I run a file (file.spark) that contains Spark commands? Is there any way to run/compile the Scala...  more
    Last post by Advika Banerjee - January 17, 2022
    1,159 views 0 likes
    4
  • Jasmine Chacko
    I have 2 DataFrames:
    I need union like this:

    The unionAll function doesn't work because the number and names of the columns are different. How can I do this?
    Last post by Advika Banerjee - January 17, 2022
    223 views 0 likes
    5
  • Viaan Prakash
    I am using spark-csv to load data into a DataFrame. I want to do a simple query and display the content:
    val df =...  more
    Last post by Vaibhav Mali - January 15, 2022
    171 views 0 likes
    2
  • Rakesh Racharla
    I'm new to Spark and I'm trying to read CSV data from a file with Spark. Here's what I am doing: sc.textFile('file.csv').map(lambda line: (line.split(','), line.split(',')))...  more
    Last post by Maryam Bains - January 11, 2022
    1,373 views 0 likes
    5
  • Maryam Bains
    I am new to this concept and still learning. I have a total of 10 TB of JSON files in AWS S3 and 4 instances (m3.xlarge) in AWS EC2 (1 master, 3 workers). I am currently using Spark with...  more
    Last post by Samar Patil - January 10, 2022
    558 views 0 likes
    4
  • Maryam Bains
    I'm trying to implement a Lambda Architecture using the following tools: Apache Kafka to receive all the datapoints, Spark for batch processing (Big Data), Spark Streaming for...  more
    Last post by Vaibhav Mali - January 6, 2022
    173 views 0 likes
    3
  • Maryam Bains
    I am using https://github.com/databricks/spark-csv and trying to write a single CSV file, but I am not able to; it creates a folder instead.
    I need a Scala function which will take parameter...  more
    Last post by Samar Patil - December 28, 2021
    271 views 0 likes
    2
  • Maryam Bains
    Hi. At university, in the data science area, we learned that if we wanted to work with small data we should use pandas, and if we work with Big Data we should use Spark, in the...  more
    Last post by Samar Patil - December 28, 2021
    183 views 0 likes
    3
  • Samar Patil
    I already have a cluster of 3 machines (ubuntu1, ubuntu2, ubuntu3, running as VirtualBox VMs) with Hadoop 1.0.0. I installed Spark on each of these machines. ub1 is my master node and the...  more
    Last post by Advika Banerjee - December 24, 2021
    203 views 0 likes
    3
  • Raji Reddy A
    How can I convert an RDD (org.apache.spark.rdd.RDD) to a DataFrame (org.apache.spark.sql.DataFrame)? I converted a DataFrame to an RDD using .rdd. After processing it I want it back in...  more
    Last post by Viaan Prakash - December 22, 2021
    936 views 0 likes
    3
  • Sindhuja Martha
    Having read this question, I would like to ask additional questions:


    The Cluster Manager is a long-running service, on which node it is...  more
    Last post by Maryam Bains - December 20, 2021
    239 views 0 likes
    3
  • Viaan Prakash
    True ... it has been discussed quite a lot.
    However there is a lot of ambiguity and some of the answers provided ... including duplicating jar references in the...  more
    Last post by Maryam Bains - December 20, 2021
    319 views 0 likes
    3

QBoard Statistics

Topics 39
Posts 158
Total Users 7404
Active Users 17