master = yarn-client
spark.driver.memory = 10g
spark.executor.memory = 10g
spark.cores.max = 4
It takes approximately 1 minute to read 1 GB. What more can I do to read big data more efficiently?
Thank you.
For performance issues, the best approach is to find out where the bottleneck actually is, or at least narrow down where the problem could be.
Since 1 minute per 1 GB is quite slow, I would try the following steps.
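For example, a common first step is to spread the same total memory across more, smaller executors so the read is parallelized over more cores, and to raise the default parallelism. A minimal sketch of such a `spark-submit` invocation, assuming a YARN cluster like yours (the application file, input path, and all numeric values are hypothetical examples, not measured recommendations):

```shell
# Hypothetical tuning sketch: more executors with fewer cores each,
# instead of a small number of large ones. Adjust to your cluster size.
spark-submit \
  --master yarn-client \
  --driver-memory 10g \
  --num-executors 8 \
  --executor-cores 2 \
  --executor-memory 4g \
  --conf spark.default.parallelism=64 \
  your_app.py hdfs:///path/to/input
```

After submitting, check the number of input tasks in the Spark UI: if the 1 GB file produces only a handful of partitions, the extra executors sit idle, and repartitioning the input (or using more, smaller files) is the next thing to try.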