Log Analysis with Spark

    Log analysis is an ideal use case for Spark. Logs are a large, common data source that contains a rich set of information. Spark lets you store your logs cheaply in files on disk while still providing a
    quick and simple way to process them. We hope this project shows you how to use Apache Spark on your organization's production logs and fully harness the power of that data. Log data can be used for monitoring your servers, improving business and customer intelligence, building recommendation systems, preventing fraud, and much more.

    This chapter introduces the Apache Spark library, along with Spark SQL and Spark Streaming. By the
    end of the chapter, you will know how to call transformations and actions and work
    with RDDs and DStreams.
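    To give a flavor of the transformation/action style before the Spark API itself is covered, here is a minimal, hedged sketch in plain Python. It parses lines in the Apache Common Log Format (an assumption about your log format), using the same map/filter pipeline you would express on an RDD with rdd.map and rdd.filter; the sample lines are hypothetical.

    ```python
    import re

    # Apache Common Log Format (an assumption; adjust the pattern to your logs):
    # host ident user [timestamp] "method path protocol" status size
    LOG_PATTERN = re.compile(
        r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "(\S+) (\S+) (\S+)" (\d{3}) (\S+)$'
    )

    def parse_log_line(line):
        """Parse one log line into a dict, or return None if it doesn't match."""
        m = LOG_PATTERN.match(line)
        if m is None:
            return None
        host, ident, user, ts, method, path, proto, status, size = m.groups()
        return {
            "host": host,
            "timestamp": ts,
            "method": method,
            "path": path,
            "status": int(status),
            "bytes": 0 if size == "-" else int(size),
        }

    # With Spark, this parser would run inside a transformation, e.g.:
    #   parsed = sc.textFile("access.log").map(parse_log_line) \
    #              .filter(lambda r: r is not None)
    # Here the same map/filter pipeline runs on an in-memory sample.
    sample = [
        '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
        'not a log line',
    ]
    parsed = [r for r in map(parse_log_line, sample) if r is not None]
    print(len(parsed))  # counting the results plays the role of an action
    ```

    With Spark, the map and filter steps stay lazy; nothing is computed until an action such as count() or collect() forces evaluation.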

    This section includes examples that illustrate how to get data out of Spark. Along the way, they reinforce the concepts of a distributed
    computing environment, and they are suitable for large datasets.
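    As a sketch of the kind of aggregation involved in getting data out, the snippet below counts hypothetical HTTP status codes in plain Python; the status values are made up for illustration. With Spark you would typically express this as a reduceByKey transformation followed by collect(), the action that pulls results back to the driver.

    ```python
    from collections import Counter

    # With Spark, the equivalent pipeline would look roughly like:
    #   counts = parsed.map(lambda r: (r["status"], 1)) \
    #                  .reduceByKey(lambda a, b: a + b)
    #   result = counts.collect()   # action: brings data back to the driver
    # Counter stands in for reduceByKey on this in-memory sample.
    statuses = [200, 200, 404, 200, 500, 404]
    counts = Counter(statuses)
    result = sorted(counts.items())
    print(result)  # [(200, 3), (404, 2), (500, 1)]
    ```

    Once results are this small, they fit comfortably on one machine, which is why collect() is safe for aggregated output but dangerous on a full, unreduced dataset.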

    More to come…

    That's all for now, but this project will continue to grow over time.