2020了你还不会Java8新特性？（六）Stream源码剖析

时间:2020-01-06 我不是铁杆啊人气:2

Stream流源码详解

节前小插曲

AutoCloseable接口：通过一个例子举例自动关闭流的实现。

public interface BaseStream<T, S extends BaseStream<T, S>>
        extends AutoCloseable{}  // BaseStream 继承了这个接口。 Stream继承了Stream

public class AutoCloseableTest implements AutoCloseable {
    public void dosomething() {
        System.out.println(" do something ");
    }

    @Override
    public void close() throws Exception {
        System.out.println(" close invoked ");
    }

    public static void main(String[] args) throws Exception {
        try ( AutoCloseableTest autoCloseableTest = new AutoCloseableTest()){
            autoCloseableTest.dosomething();
        }
    }
}

运行结果如下：自动调用了关闭流的方法

Stream

/**
 * A sequence of elements supporting sequential and parallel aggregate
 * operations.  The following example illustrates an aggregate operation using
 * {@link Stream} and {@link IntStream}:
 *
 * <pre>{@code   // 举例： 
 *     int sum = widgets.stream()
 *                      .filter(w -> w.getColor() == RED)
 *                      .mapToInt(w -> w.getWeight())
 *                      .sum();
 * }</pre>
 *
 * In this example, {@code widgets} is a {@code Collection<Widget>}.  We create
 * a stream of {@code Widget} objects via {@link Collection#stream Collection.stream()},
 * filter it to produce a stream containing only the red widgets, and then
 * transform it into a stream of {@code int} values representing the weight of
 * each red widget. Then this stream is summed to produce a total weight.
 *
 * <p>In addition to {@code Stream}, which is a stream of object references,
 * there are primitive specializations for {@link IntStream}, {@link LongStream},
 * and {@link DoubleStream}, all of which are referred to as "streams" and
 * conform to the characteristics and restrictions described here.
 jdk提供了平行的 特化的流。
 *
 * <p>To perform a computation, stream
 * <a href="package-summary.html#StreamOps">operations</a> are composed into a
 * <em>stream pipeline</em>.  A stream pipeline consists of a source (which
 * might be an array, a collection, a generator function, an I/O channel,
 * etc), zero or more <em>intermediate operations</em> (which transform a
 * stream into another stream, such as {@link Stream#filter(Predicate)}), and a
 * <em>terminal operation</em> (which produces a result or side-effect, such
 * as {@link Stream#count()} or {@link Stream#forEach(Consumer)}).
 * Streams are lazy; computation on the source data is only performed when the
 * terminal operation is initiated, and source elements are consumed only
 * as needed.
 
 
 为了执行计算，流会被执行到一个流管道当中。
 一个流管道包含了：
 一个源。（数字来的地方）
 0个或多个中间操作（将一个stream转换成另外一个Stream）。
 一个终止操作（会生成一个结果，或者是一个副作用（求和，遍历））。
 
 流是延迟的，只有当终止操作被发起的时候，才会执行中间操作。
 
 * <p>Collections and streams, while bearing some superficial similarities,
 * have different goals.  Collections are primarily concerned with the efficient
 * management of, and access to, their elements.  By contrast, streams do not
 * provide a means to directly access or manipulate their elements, and are
 * instead concerned with declaratively describing their source and the
 * computational operations which will be performed in aggregate on that source.
 * However, if the provided stream operations do not offer the desired
 * functionality, the {@link #iterator()} and {@link #spliterator()} operations
 * can be used to perform a controlled traversal.
 
 集合和流虽然有一些相似性，但是他们的差异是不同的。
 集合是为了高效对于元素的管理和访问。流并不会提供方式去直接操作流里的元素。（集合关注的是数据的管理，流关注的是元素内容的计算）
 如果流操作并没有提供我们需要的功能，那么我们可以使用传统的iterator or spliterator去执行操作。 

 * <p>A stream pipeline, like the "widgets" example above, can be viewed as
 * a <em>query</em> on the stream source.  Unless the source was explicitly
 * designed for concurrent modification (such as a {@link ConcurrentHashMap}),
 * unpredictable or erroneous behavior may result from modifying the stream
 * source while it is being queried.
 
 一个流管道，可以看做是对流源的查询，除非这个流被显示的设计成可以并发修改的。否则会抛出异常。
 （如一个线程对流进行修改，另一个对流进行查询）
 
 * <p>Most stream operations accept parameters that describe user-specified
 * behavior, such as the lambda expression {@code w -> w.getWeight()} passed to
 * {@code mapToInt} in the example above.  To preserve correct behavior,
 * these <em>behavioral parameters</em>:
    //为了能满足结果，需满足下边的条件。
 * <ul>
 * <li>must be <a href="package-summary.html#NonInterference">non-interfering</a>
 * (they do not modify the stream source); and</li>
 * <li>in most cases must be <a href="package-summary.html#Statelessness">stateless</a>
 * (their result should not depend on any state that might change during execution
 * of the stream pipeline).</li>
 * </ul>
 
 行为上的参数，大多是无状态的。
 
 * <p>Such parameters are always instances of a
 * <a href="../function/package-summary.html">functional interface</a> such
 * as {@link java.util.function.Function}, and are often lambda expressions or
 * method references.  Unless otherwise specified these parameters must be
 * <em>non-null</em>.
 
    无一例外的。这种参数总是函数式接口的形式。也就是lambda表达式。除非特别指定，这些参数必须是非空的。
    
 * <p>A stream should be operated on (invoking an intermediate or terminal stream
 * operation) only once.  This rules out, for example, "forked" streams, where
 * the same source feeds two or more pipelines, or multiple traversals of the
 * same stream.  A stream implementation may throw {@link IllegalStateException}
 * if it detects that the stream is being reused. However, since some stream
 * operations may return their receiver rather than a new stream object, it may
 * not be possible to detect reuse in all cases.
 
 一个流只能被使用一次。对相同的流进行多次操作，需要创建多个流管道。
 
 * <p>Streams have a {@link #close()} method and implement {@link AutoCloseable},
 * but nearly all stream instances do not actually need to be closed after use.
 * Generally, only streams whose source is an IO channel (such as those returned
 * by {@link Files#lines(Path, Charset)}) will require closing.  Most streams
 * are backed by collections, arrays, or generating functions, which require no
 * special resource management.  (If a stream does require closing, it can be
 * declared as a resource in a {@code try}-with-resources statement.)
 
 流拥有一个closed方法，实现了AutoCloseable，在他的父类里。 最上面以举例实现。
 但是一个流 除了是I/O流（因为持有句柄等资源）才需要被关闭外，是不需要被关闭的。
 大多数的流底层是集合、数组或者是生成器函数。 他们并不需要特别的资源管理。如果需要被关闭，可以用try（）操作。

 * <p>Stream pipelines may execute either sequentially or in
 * <a href="package-summary.html#Parallelism">parallel</a>.  This
 * execution mode is a property of the stream.  Streams are created
 * with an initial choice of sequential or parallel execution.  (For example,
 * {@link Collection#stream() Collection.stream()} creates a sequential stream,
 * and {@link Collection#parallelStream() Collection.parallelStream()} creates
 * a parallel one.)  This choice of execution mode may be modified by the
 * {@link #sequential()} or {@link #parallel()} methods, and may be queried with
 * the {@link #isParallel()} method.
 
 流管道可以被串行或者并行操作。这种模式只是一个属性而已。 初始化的时候会进行一个选择。
 比如说  stream() 是串行流。parallelStream()是并行流。 
 还可以通过sequential（）or parallel() 来进行修改。 以最后一个被调用的方法为准。
 也可以用isParallel（）来进行查询流是否是并行流。
 
 * @param <T> the type of the stream elements
 * @since 1.8
 * @see IntStream
 * @see LongStream
 * @see DoubleStream
 * @see <a href="package-summary.html">java.util.stream</a>
 */
public interface Stream<T> extends BaseStream<T, Stream<T>> {
  
  // 具体举例， 源码中有例子
    Stream<T> filter(Predicate<? super T> predicate);    // 过滤
  <R> Stream<R> map(Function<? super T, ? extends R> mapper);  //映射
  IntStream mapToInt(ToIntFunction<? super T> mapper);
  LongStream mapToLong(ToLongFunction<? super T> mapper);
  DoubleStream mapToDouble(ToDoubleFunction<? super T> mapper); 
  <R> Stream<R> flatMap(Function<? super T, ? extends Stream<? extends R>> mapper); //压平
  IntStream flatMapToInt(Function<? super T, ? extends IntStream> mapper);
  LongStream flatMapToLong(Function<? super T, ? extends LongStream> mapper);
  DoubleStream flatMapToDouble(Function<? super T, ? extends DoubleStream> mapper);、
  Stream<T> distinct();// 去重
  Stream<T> sorted(); //排序
  Stream<T> sorted(Comparator<? super T> comparator);
  Stream<T> peek(Consumer<? super T> action);  
  Stream<T> limit(long maxSize);  // 截断
  void forEach(Consumer<? super T> action); // 遍历
  void forEachOrdered(Consumer<? super T> action); // 遍历时执行操作
  Object[] toArray();  // 转数组
  T reduce(T identity, BinaryOperator<T> accumulator); //  汇聚， 返回一个汇聚的结果
  <R> R collect(Supplier<R> supplier,
                  BiConsumer<R, ? super T> accumulator,
                  BiConsumer<R, R> combiner);     // 收集器 
  。。。  
}

自行参考父接口中的方法；

Stream中具体方法的详解

分割迭代器：

/**

 * Returns a spliterator for the elements of this stream.
 *
 * <p>This is a <a href="package-summary.html#StreamOps">terminal
 * operation</a>.
 *
 * @return the element spliterator for this stream
 */
Spliterator<T> spliterator();

baseStream 源码讲解

BaseStream 是所有流的父类。

/**
 * Base interface for streams, which are sequences of elements supporting
 * sequential and parallel aggregate operations.  The following example
 * illustrates an aggregate operation using the stream types {@link Stream}
 * and {@link IntStream}, computing the sum of the weights of the red widgets:
 *
 * <pre>{@code
 *     int sum = widgets.stream()
 *                      .filter(w -> w.getColor() == RED)
 *                      .mapToInt(w -> w.getWeight())
 *                      .sum();
 * }</pre>
 *
 * See the class documentation for {@link Stream} and the package documentation
 * for <a href="package-summary.html">java.util.stream</a> for additional
 * specification of streams, stream operations, stream pipelines, and
 * parallelism, which governs the behavior of all stream types.
 *
 * @param <T> the type of the stream elements
 * @param <S> the type of of the stream implementing {@code BaseStream}
 * @since 1.8
 * @see Stream
 * @see IntStream
 * @see LongStream
 * @see DoubleStream
 * @see <a href="package-summary.html">java.util.stream</a>
 */
public interface BaseStream<T, S extends BaseStream<T, S>> extends AutoCloseable 

public interface Stream<T> extends BaseStream<T, Stream<T>>

BaseStream(){

 Iterator<T> iterator(); 迭代器
 Spliterator<T> spliterator();  分割迭代器  。 这是一个流的终止操作。
 boolean isParallel();  是否是并行。   
 S sequential();  // 返回一个等价的串行流。   返回S是一个新的流对象
 S parallel();   //返回一个并行流。 
 S unordered();   // 返回一个无序的流。
 S onClose(Runnable closeHandler);   //当前流.onClose、 当close调用时，调用此方法。
 void close();      // 关闭流
 
 }

关闭处理器的举例

/**
     * Returns an equivalent stream with an additional close handler.  Close
     * handlers are run when the {@link #close()} method
     * is called on the stream, and are executed in the order they were
     * added.  All close handlers are run, even if earlier close handlers throw
     * exceptions.  If any close handler throws an exception, the first
     * exception thrown will be relayed to the caller of {@code close()}, with
     * any remaining exceptions added to that exception as suppressed exceptions
     * (unless one of the remaining exceptions is the same exception as the
     * first exception, since an exception cannot suppress itself.)  May
     * return itself.
     *
     * <p>This is an <a href="package-summary.html#StreamOps">intermediate
     * operation</a>.
     *
     * @param closeHandler A task to execute when the stream is closed
     * @return a stream with a handler that is run if the stream is closed
     */
    S onClose(Runnable closeHandler);

 public static void main(String[] args) {

        List<String> list = Arrays.asList("hello","world");
        NullPointerException nullPointerException = new NullPointerException("myexception");
        try (Stream<String> stream = list.stream()){
            stream.onClose(()->{
                System.out.println("aaa");
//                throw new NullPointerException("first");
                throw nullPointerException;
            }).onClose(()->{
                System.out.println("aaa");
                throw nullPointerException;
            }).forEach(System.out::println);
        }
        // 出现异常会被压制，
        // 如果是同一个异常对象，只会打印一次异常。 如果是多个异常对象。都会被打印。
    }

javadoc 中的介绍比任何资料都详细。

Stream 源码分析。

stream();

/**
     * Returns a sequential {@code Stream} with this collection as its source.
     
     返回一个串行流，把这个集合当做源
     
     * <p>This method should be overridden when the {@link #spliterator()}
     * method cannot return a spliterator that is {@code IMMUTABLE},
     * {@code CONCURRENT}, or <em>late-binding</em>. (See {@link #spliterator()}
     * for details.)

     当不能返回  三种方法 中的一个时，这个方法应该被重写。
     
     * @implSpec
     * The default implementation creates a sequential {@code Stream} from the
     * collection's {@code Spliterator}.
     
     默认会从集合中创建一个串行流。 返回
     
     * @return a sequential {@code Stream} over the elements in this collection
     * @since 1.8
     */
    default Stream<E> stream() {
        return StreamSupport.stream(spliterator(), false);
    }

spliterator(); 分割迭代器

 /**
     * Creates a {@link Spliterator} over the elements in this collection.
     *
     * Implementations should document characteristic values reported by the
     * spliterator.  Such characteristic values are not required to be reported
     * if the spliterator reports {@link Spliterator#SIZED} and this collection
     * contains no elements.
     
     * <p>The default implementation should be overridden by subclasses that
     * can return a more efficient spliterator.  In order to
     * preserve expected laziness behavior for the {@link #stream()} and
     * {@link #parallelStream()}} methods, spliterators should either have the
     * characteristic of {@code IMMUTABLE} or {@code CONCURRENT}, or be
     * <em><a href="Spliterator.html#binding">late-binding</a></em>.
     
     默认的子类应该被重写。为了保留parallelStream  和 stream的延迟行为。特性需要满足IMMUTABLE 或者CONCURRENT
     
     * If none of these is practical, the overriding class should describe the
     * spliterator's documented policy of binding and structural interference,
     * and should override the {@link #stream()} and {@link #parallelStream()}
     * methods to create streams using a {@code Supplier} of the spliterator,
     * as in:
     * <pre>{@code
     *     Stream<E> s = StreamSupport.stream(() -> spliterator(), spliteratorCharacteristics)
     * }</pre>
     
     为什么叫分割迭代器。先分割，在迭代。
     如果不能满足上述的要求，则重写的时候应该满足上述的需求、
     
     * <p>These requirements ensure that streams produced by the
     * {@link #stream()} and {@link #parallelStream()} methods will reflect the
     * contents of the collection as of initiation of the terminal stream
     * operation.
     
     这些确保了流会返回的内容。
     
     * @implSpec
     * The default implementation creates a
     * <em><a href="Spliterator.html#binding">late-binding</a></em> spliterator
     * from the collections's {@code Iterator}.  The spliterator inherits the
     * <em>fail-fast</em> properties of the collection's iterator.
     * <p>
     * The created {@code Spliterator} reports {@link Spliterator#SIZED}.
    
     默认会从集合的迭代器中创建出一个延迟的分割迭代器。 默认的迭代器 会有默认大小的迭代器。
     
     * @implNote
     * The created {@code Spliterator} additionally reports
     * {@link Spliterator#SUBSIZED}.
     *
     * <p>If a spliterator covers no elements then the reporting of additional
     * characteristic values, beyond that of {@code SIZED} and {@code SUBSIZED},
     * does not aid clients to control, specialize or simplify computation.
     * However, this does enable shared use of an immutable and empty
     * spliterator instance (see {@link Spliterators#emptySpliterator()}) for
     * empty collections, and enables clients to determine if such a spliterator
     * covers no elements.
     
     如果分割迭代器不包含任何元素。 其他的属性对客户端是没有任何帮助的。 然而会促进分割迭代器共享的作用。
     
     * @return a {@code Spliterator} over the elements in this collection
     * @since 1.8
     */
    @Override
    default Spliterator<E> spliterator() {
        return Spliterators.spliterator(this, 0);
    }

加载全部内容