Spark

Introduction

  1. Written in Scala

Principles

  1. Data reused across iterations can be cached in memory, avoiding repeated reads from disk

Ecosystem

Spark Streaming (stream processing, executed as a series of micro-batches)

Examples

Word Count

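// Assumes sc is an existing JavaSparkContext. Needs java.util.Arrays, scala.Tuple2,
// and JavaRDD / JavaPairRDD from org.apache.spark.api.java.
JavaRDD<String> textFile = sc.textFile("hdfs://...");
JavaPairRDD<String, Integer> counts = textFile
    .flatMap(s -> Arrays.asList(s.split(" ")).iterator())  // split each line into words
    .mapToPair(word -> new Tuple2<>(word, 1))               // pair each word with a count of 1
    .reduceByKey((a, b) -> a + b);                          // sum the counts for each word
counts.saveAsTextFile("hdfs://...");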

http://spark.apache.org/examples.html
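
The Word Count pipeline above has three steps: split lines into words, pair each word with 1, and sum the pairs by key. To try those steps without a cluster, here is a minimal sketch that uses java.util.stream as a stand-in for the RDD API. It is not Spark code: the class name WordCountSketch, the helper countWords, and the sample input are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class WordCountSketch {
    // Hypothetical helper: counts words of in-memory lines, mirroring the RDD steps.
    static Map<String, Integer> countWords(List<String> lines) {
        return lines.stream()
            // Plays the role of flatMap: split each line into words.
            .flatMap(line -> Arrays.stream(line.split(" ")))
            // Plays the role of mapToPair + reduceByKey:
            // key by word, start each word at 1, sum values that share a key.
            .collect(Collectors.toMap(word -> word, word -> 1, Integer::sum));
    }

    public static void main(String[] args) {
        // Sample input made up for this sketch.
        List<String> lines = Arrays.asList("to be or", "not to be");
        System.out.println(countWords(lines));
    }
}
```

Unlike the Spark version, this runs in a single JVM and holds everything in one map, so it only illustrates the shape of the computation, not its distribution across executors.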

Resources

wangyaqi.cn, all rights reserved. Powered by Gitbook. File last revised: 2020-04-18 15:35:02
