1 Actors

import akka.actor.{ ActorRef, ActorSystem, Props, Actor, Inbox }
import scala.concurrent.duration._

// Three messages: one case object and two case classes
case object Greet
case class WhoToGreet(who: String)
case class Greeting(message: String)

class Greeter extends Actor {
  var greeting = ""

  def receive = {
    case WhoToGreet(who) => greeting = s"hello, $who"
    // Send the current greeting back to the sender
    case Greet           => sender ! Greeting(greeting)
  }
}

object HelloAkkaScala extends App {

  // Create the 'helloakka' actor system
  val system = ActorSystem("helloakka")

  // Create the 'greeter' actor
  val greeter = system.actorOf(Props[Greeter], "greeter")

  // Create an "actor-in-a-box"
  val inbox = Inbox.create(system)

  // Tell the 'greeter' to change its 'greeting' message
  greeter.tell(WhoToGreet("akka"), ActorRef.noSender)
  // Ask the 'greeter' for the latest 'greeting'
  // Reply should go to the "actor-in-a-box"
  inbox.send(greeter, Greet)

  // Wait 5 seconds for the reply with the 'greeting' message
  val Greeting(message1) = inbox.receive(5.seconds)
  println(s"Greeting: $message1")

  // Change the greeting and ask for it again
  greeter.tell(WhoToGreet("typesafe"), ActorRef.noSender)
  inbox.send(greeter, Greet)
  val Greeting(message2) = inbox.receive(5.seconds)
  println(s"Greeting: $message2")

  val greetPrinter = system.actorOf(Props[GreetPrinter])
  // after zero seconds, send a Greet message every second to the greeter with a sender of the greetPrinter
  system.scheduler.schedule(0.seconds, 1.second, greeter, Greet)(system.dispatcher, greetPrinter)

}

// prints a greeting
class GreetPrinter extends Actor {
  def receive = {
    case Greeting(message) => println(message)
  }
}
  • Sending messages
    1. actor ! message or actor.tell(message, sender).
    2. inbox.send(target, message)
    3. system.scheduler.schedule(initDelay, interval, receiver, message)(executor, sender)
    4. actor ? message means “ask”: it returns a Future representing a possible reply. In the actor that answers an ask, you have to catch exceptions and send a Failure to the sender, because the Future is not completed with the exception automatically:
      try {
        val result = operation()
        sender() ! result
      } catch {
        case e: Exception =>
          sender() ! akka.actor.Status.Failure(e)
          throw e
      }
      

      You can set the timeout for the Future either with an implicit value, implicit val timeout = Timeout(5 seconds), or explicitly through the second (curried) parameter list, myActor.ask(msg)(5 seconds).

  • Receiving messages
    1. The receive method defined in the actor is called when a message arrives. Attention: sender() returns the sender of the message that is being processed at the moment it is called. DO NOT call it inside callbacks, because by the time the callback is actually invoked it may return the wrong sender (deadLetters in many cases).
    2. inbox.receive
    3. Set a receive timeout with context.setReceiveTimeout(30 milliseconds). The actor is then notified with a ReceiveTimeout message, which you handle with case ReceiveTimeout.
  • Forward message: target forward message keeps the sender unchanged. If A sends msg to B and B forwards msg to C, the sender seen by C is still A, not B.
  • Stop actor: an akka.actor.PoisonPill message stops the actor. gracefulStop is useful to wait for termination or to compose ordered termination of several actors. It sends a stop message to the actor (PoisonPill by default, or a user-defined message) and returns a Future; you can Await this future for a duration.

    actor ! Kill also stops the actor, but it makes the actor throw an ActorKilledException. The actor is suspended and its supervisor decides how to handle the failure.

  • Become/Unbecome: the context.become method in an Actor sets a new receive function, which changes the behavior of the Actor. unbecome returns the actor to its previous behavior.

    You can use stash and unstash to save messages into a queue in one behavior and get them back in another behavior. To use them, you have to mix in the Stash trait.

  • Extend actor and initialization patterns: use PartialFunction#orElse to chain the receive function of the extended actor.

    Three init patterns

    1. constructor
    2. preStart
    3. Message
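
The orElse chaining mentioned above can be tried without Akka, since an actor's Receive is just an alias for PartialFunction[Any, Unit]. The sketch below is only an illustration under that assumption: the names ReceiveChain, baseReceive and extendedReceive are made up, and the handlers return a String instead of Unit so the result is visible.

```scala
// A minimal sketch of chaining "receive" functions with PartialFunction#orElse.
// All names here are hypothetical; in Akka the type would be Actor.Receive.
object ReceiveChain {
  type Handler = PartialFunction[Any, String]

  // Behaviour of a hypothetical base actor
  val baseReceive: Handler = {
    case "ping" => "pong from base"
  }

  // Behaviour added by a hypothetical extended actor, tried first
  val extendedReceive: Handler = {
    case n: Int => s"number $n from extension"
  }

  // Messages not matched by the extension fall through to the base handler
  val receive: Handler = extendedReceive orElse baseReceive
}
```

A message that neither handler matches stays undefined for the chained function, which in an actor means it ends up as an unhandled message.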

1.1 Fault tolerance

Example code from the Akka documentation.

  1. Counter: increments the count, saves it to storage and returns the current count.
  2. Storage: simulates a database that stores (String, Long) pairs.
  3. CounterService: overrides supervisorStrategy with a OneForOneStrategy
    override val supervisorStrategy = OneForOneStrategy(maxNrOfRetries = 3,
      withinTimeRange = 5.seconds) {
      case _: Storage.StorageException => Restart
    }
    

    On init, it creates an actor reference to the wrapper (watcher) of the storage and tells the counter (once it is initialized?) to use the storage.

    It forwards messages (increment, getCurrentCount) to the counter if it exists; if the counter is not available, it saves the sender -> msg pair to a log.

    If the storage is terminated, it tells the counter that the storage is not available and schedules a task to run initStorage again (reconnect).

  4. Worker: initializes the CounterService actor and stops when it receives a message from CounterService.
  5. Listener: shuts down the system on timeout. It listens for Progress, which is a case class of Worker.
  6. App object: initializes the worker and the listener, then tells the worker which listener to use. The worker schedules a task that uses the CounterService to increment the count and pipes the result to the listener.

1.2 Persistent Actor

2 Cluster

2.1 Simple Cluster

2.1.1 Configuration

To run a cluster, you first have to create a configuration file that specifies the URI, actors, etc.

akka {
  actor {
    provider = "akka.cluster.ClusterActorRefProvider"
  }
  remote {
    log-remote-lifecycle-events = off
    netty.tcp {
      hostname = "127.0.0.1"
      port = 0
    }
  }

  cluster {
    seed-nodes = [
      "akka.tcp://ClusterSystem@127.0.0.1:2551",
      "akka.tcp://ClusterSystem@127.0.0.1:2552"]

    auto-down-unreachable-after = 10s
  }
}

2.1.2 Code

package sample.cluster.simple

import com.typesafe.config.ConfigFactory
import akka.actor.{ Actor, ActorLogging, ActorSystem, Props }
import akka.cluster.Cluster
import akka.cluster.ClusterEvent._

object SimpleClusterApp {
  def main(args: Array[String]): Unit = {
    if (args.isEmpty)
      startup(Seq("2551", "2552", "0"))
    else
      startup(args)
  }

  def startup(ports: Seq[String]): Unit = {
    ports foreach { port =>
      // Override the configuration of the port
      val config = ConfigFactory.parseString("akka.remote.netty.tcp.port=" + port).
        withFallback(ConfigFactory.load())

      // Create an Akka system
      val system = ActorSystem("ClusterSystem", config)
      // One system on one port?

      // Create an actor that handles cluster domain events
      system.actorOf(Props[SimpleClusterListener], name = "clusterListener")
    }
  }

}

class SimpleClusterListener extends Actor with ActorLogging {

  val cluster = Cluster(context.system)

  // subscribe to cluster changes, re-subscribe when restart
  override def preStart(): Unit = {
    //#subscribe
    cluster.subscribe(self, initialStateMode = InitialStateAsEvents,
      classOf[MemberEvent], classOf[UnreachableMember])
    //#subscribe
  }
  override def postStop(): Unit = cluster.unsubscribe(self)

  def receive = {
    case MemberUp(member) =>
      log.info("Member is Up: {}", member.address)
    case UnreachableMember(member) =>
      log.info("Member detected as unreachable: {}", member)
    case MemberRemoved(member, previousStatus) =>
      log.info("Member is Removed: {} after {}",
        member.address, previousStatus)
    case _: MemberEvent => // ignore
  }
}

2.1.3 Running

Log when starting the second seed node: [info] [INFO] [03/10/2015 23:57:27.799] [ClusterSystem-akka.actor.default-dispatcher-14] [Cluster(akka://ClusterSystem)] Cluster Node [akka.tcp://ClusterSystem@127.0.0.1:2551] - Node [akka.tcp://ClusterSystem@127.0.0.1:2552] is JOINING, roles []

This code defines a cluster. Each node runs as an independent process (a different JVM), and here they all run on my local machine. A cluster is identified by akka://ClusterSystem, while each node is identified by IP address and port number. The IP address seems to be defined in application.conf.

  • Seed nodes are the configured contact points for the initial, automatic join of the cluster. Hence, seed-nodes tells the current node where to join the cluster. The seed nodes can also be configured with command line options when starting the JVM, using the following syntax:
    -Dakka.cluster.seed-nodes.0=akka.tcp://ClusterSystem@host1:2552
    -Dakka.cluster.seed-nodes.1=akka.tcp://ClusterSystem@host2:2552

    Seed nodes can be started in any order and not all of them have to be running, but the first node in the list must be started when a cluster is started initially.

    It is also possible to join the cluster manually via JMX or Command Line Management, or programmatically with Cluster.get(system).join(address) or Cluster(system).joinSeedNodes.

  • hostname under remote in the configuration specifies the IP of the current node. Question: /what if I set the IP to that of another machine? Does the application have to be deployed to that machine before running? Which protocol does it use?/

2.2 Dial-in example

Frontend-backend pattern (not the one in Group router):

  1. The backend subscribes to the cluster service to register new frontends.
  2. The frontend watches the backend and defines the receive method for the case Terminated(a).
  3. The frontend redirects received jobs to the backend.
  4. The roles of a node are defined in the configuration property akka.cluster.roles, or in the start script as a system property or environment variable.

Restricting the number of members: with akka.cluster.min-nr-of-members = 3, the leader will not change the status of nodes to ‘Up’ until this number of members is reached. Finer configuration:

akka.cluster.role {
  frontend.min-nr-of-members = 1
  backend.min-nr-of-members = 2
}

Define a callback to be invoked when the current member status is changed to ‘Up’:

Cluster(system) registerOnMemberUp {
  system.actorOf(Props(classOf[FactorialFrontend], upToN, true),
    name = "factorialFrontend")
}

2.3 Cluster Aware Routers

A router is an actor that receives tasks and sends them on to other actors; the actors it routes to are called routees.

2.3.1 Group router

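In the sample code, every node runs one StatsService and one StatsWorker. StatsService contains a router that points to the StatsWorkers and is responsible for receiving and forwarding jobs. Each time the client sends a job to the cluster, the StatsService on one of the nodes receives and dispatches it. Because dispatching is done through the router, the job is distributed to StatsWorkers on different nodes. In this example, the Service and the Worker on each node are created explicitly.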

Group example: Server side

akka.actor.deployment {
  /statsService/workerRouter {
      router = consistent-hashing-group
      // specify the router, you have also other options like round-robin-pool
      nr-of-instances = 100
      routees.paths = ["/user/statsWorker"]
      cluster {
        enabled = on
        allow-local-routees = on
        use-role = compute
      }
    }
}

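With this configuration, the relative actor paths defined in routees.paths are used by the router to select the actors it sends messages to. Include this configuration file in the configuration when creating the system, then use actorOf(FromConfig.props(Props[YourClass])) anywhere under this system to obtain the router actor. To send a job, send it to the router actor; it automatically forwards the job to the actors in the cluster that are registered under "user" with the name "statsWorker" (not necessarily a StatsWorker, you are free to register any type of Actor under this name). For more configuration options, see routing.

The following code does programmatically what the configuration file above does.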

val workerRouter = context.actorOf(
    ClusterRouterGroup(ConsistentHashingGroup(Nil), ClusterRouterGroupSettings(
      totalInstances = 100, routeesPaths = List("/user/statsWorker"),
      allowLocalRoutees = true, useRole = Some("compute"))).props(),
    name = "workerRouter2")

Group example: Client side

object StatsSampleClient {
  def main(args: Array[String]): Unit = {
    // note that client is not a compute node, role not defined
    val system = ActorSystem("ClusterSystem")
    system.actorOf(Props(classOf[StatsSampleClient], "/user/statsService"), "client")
  }
}

class StatsSampleClient(servicePath: String) extends Actor {
  // servicePath is "/user/statsService", which is the second argument of the Props.
  val cluster = Cluster(context.system)
  val servicePathElements = servicePath match {
    case RelativeActorPath(elements) => {
//      println("servicePath is: "+servicePath) ;
      elements}
    case _ => throw new IllegalArgumentException(
      "servicePath [%s] is not a valid relative actor path" format servicePath)
  }
  import context.dispatcher
  val tickTask = context.system.scheduler.schedule(2.seconds, 10.seconds, self, "tick")

  var nodes = Set.empty[Address]

  override def preStart(): Unit = {
    cluster.subscribe(self, classOf[MemberEvent], classOf[ReachabilityEvent])
  }
  override def postStop(): Unit = {
    cluster.unsubscribe(self)
    tickTask.cancel()
  }

  def receive = {
    case "tick" if nodes.nonEmpty =>
      // just pick any one
      val address = nodes.toIndexedSeq(ThreadLocalRandom.current.nextInt(nodes.size))
      val service = context.actorSelection(RootActorPath(address) / servicePathElements)
      service ! StatsJob("this is the text that will be analyzed")
    case result: StatsResult =>
      println(result)
    case failed: JobFailed =>
      println(failed)
    case state: CurrentClusterState =>
      nodes = state.members.collect {
        case m if m.hasRole("compute") && m.status == MemberStatus.Up => m.address
      }
    case MemberUp(m) if m.hasRole("compute")        => nodes += m.address
    case other: MemberEvent                         => nodes -= other.member.address
    case UnreachableMember(m)                       => nodes -= m.address
    case ReachableMember(m) if m.hasRole("compute") => nodes += m.address
  }

}

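In the client code, we first subscribe to cluster member events to learn about all the members, then filter the nodes by member role. When it is time to actually send a job, the client picks one of these nodes at random and uses context.actorSelection(RootActorPath(address) / servicePathElements) to get an ActorSelection object. A message sent to it is automatically delivered to a particular actor on a particular node registered in the cluster. To be precise, the second half determines the URN of the Actor, and the first half should be the URL.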

2.3.2 Pool router

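ClusterSingletonManager is used as the wrapper of StatsService. Although the creation code runs on every node, only one StatsService is actually created, which makes it a singleton. After a node is created, it contacts the other nodes and joins the cluster, then connects to the statsService under singleton. The first node creates the router and then 3 workers; later nodes create 3 workers right after connecting to statsService. The workers are created by workerRouter, and the number 3 comes from max-nr-of-instances-per-node = 3 in the configuration file. This setting only applies to pools; in a group, no workers are created even if it is set.

Singleton means there is only one such Actor in the whole cluster. So far it looks like the node that was created first hosts the singleton. What happens if this node goes down?

singletonManager is mapped to /user/singleton as the container of StatsService, so StatsService is naturally mapped below it at /user/singleton/statsService.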

2.4 Adaptive Load Balancing

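There are two implementations: AdaptiveLoadBalancing Pool and AdaptiveLoadBalancing Group. The core pieces are cluster metrics and a router configured as adaptive.

  1. Note: do not send messages in preStart. The routing is not set up yet at that point, so all or part of the messages sent become dead letters.
  2. Another pitfall: the router distributes the load over every node of the cluster according to its algorithm. In the example, roles are used to tell frontend and backend nodes apart, so the router only sends messages to backend nodes. But a backend node does not necessarily run a backend actor, and the router has no way to detect that. So if you also mark a node that only runs frontend actors as backend, the router still sends messages to it, and those messages end up lost. The identity of a node depends only on the role in the system config, not on which actors actually run on it.
  3. One more pitfall: do not call any state-dependent function in a callback unless you are sure you need that side effect. Calling sender() in a callback often returns deadLetters, because sender() is actually invoked at a time other than when the message was received.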
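
The sender() pitfall from the list above is usually avoided by reading sender() while the message is still being processed and keeping it in a local val. The following is a minimal sketch of that idea, not code from the sample. It assumes Akka classic actors in version 2.4 or later (for system.terminate()), and Shouter and SenderCaptureSketch are made-up names.

```scala
import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Hypothetical actor that answers asynchronously from a Future callback
class Shouter extends Actor {
  import context.dispatcher // execution context for the Future below

  def receive: Receive = {
    case text: String =>
      // Capture the sender now; the callback may run after the actor
      // has already moved on to another message.
      val replyTo = sender()
      Future(text.toUpperCase).foreach(result => replyTo ! result)
  }
}

object SenderCaptureSketch {
  // Starts a throwaway system, asks the actor once and returns the reply
  def run(): String = {
    val system = ActorSystem("senderCapture")
    try {
      implicit val timeout: Timeout = Timeout(5.seconds)
      val shouter = system.actorOf(Props(new Shouter), "shouter")
      Await.result((shouter ? "hello").mapTo[String], timeout.duration)
    } finally system.terminate()
  }
}
```

Writing sender() inside the foreach callback instead of replyTo would compile as well, which is what makes this mistake easy to miss.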

1 Project file structure

Generally, we use maven to manage the directory layout. Hence, you will get a standard directory layout like below:

project
    src
        main
            java
            [scala]
            resources
                environment.properties
                environment.test.properties
                environment.prod.properties
                config
                    application.yml
                    application-dev.yml
                    application-[profile].yml
                [other]
        test
            java
            [scala]
    target

2 Management with Maven

The files environment.[profile].properties are used to provide different profiles to maven.

Since maven is used to manage the package dependencies, different maven profiles enable you to use different dependencies for development and production, e.g. to run different tasks. I am not familiar with maven, and cannot tell how and when this can be used.

You have to add spring-boot-maven-plugin as one of the plugins under the build node in pom.xml.

  • Development
    • Compile the project with mvn clean install
    • More than Build

      Maven can also help with running tests, running the web application and producing reports through plugins. TODO

    • Compile and run: mvn spring-boot:run to run the application with default profile; mvn -Pprod spring-boot:run to run the application with prod profile.
  • Production
    • Package the files
      • Jar file. Run mvn clean package to build the jar file, then run it from the command line.
      • War file. Add <packaging>war</packaging> to the pom.xml file, under the root node. Then run mvn package as when building the jar file.
    • Deploy

From the command line, you can run the jar file and specify the profile:

java -jar -Dspring.profiles.active=production YOUR_JAR.jar

3 Application Configuration

The files application-[profile].yml are used to provide different configurations for spring.

Spring configuration files let you inject different configurations for development and production. For example, you can use different databases for development and production (different host, port, username and password, etc.).

3.1 Inject configurations

You can access these configurations in several ways:

  • Inject the configuration value into an attribute of a class with @Value("${NAME:DEFAULT_VALUE}").
  • You can also inject the Environment object.
    // This class is not only about reading the property, but about defining
    // a bean from it and making it accessible from somewhere else.
    @Configuration
    public class AppConfig {
        @Inject Environment env;
    
        @Bean
        public MyBean myBean() {
            MyBean myBean = new MyBean();
            myBean.setName(env.getProperty("bean.name"));
            return myBean;
        }
    }
    
  • You can also implement the EnvironmentAware interface to receive an Environment object and read properties from it.
    public class MailConfiguration implements EnvironmentAware {
        // ENV_SPRING_MAIL is a String constant with the property prefix, defined elsewhere
        private RelaxedPropertyResolver propertyResolver;

        @Override
        public void setEnvironment(Environment environment) {
            this.propertyResolver = new RelaxedPropertyResolver(environment, ENV_SPRING_MAIL);
        }
    }
    

One difference between Environment and @Value is that Environment also gives access to information related to profiles. The environment can determine which profiles are currently active, and which profiles should be active by default.

env.acceptsProfiles(String... profiles)
env.getActiveProfiles() // return an array of strings

The Environment interface extends PropertyResolver, which is used to access the properties. You can get a property directly from the environment object. The example above shows a way to get properties by key name following relaxed rules.

3.2 Load configuration files.

Different property sources are loaded in a predefined order. You can set a default additional profile when none is given on the command line:

SimpleCommandLinePropertySource source = new SimpleCommandLinePropertySource(args);
if (!source.containsProperty("spring.profiles.active")) {
    app.setAdditionalProfiles([profile_name]);
}

3.3 Define configuration source class

A class is recognized as a source of configuration if it has the @Configuration annotation. A @Bean defined in this class can be accessed from AnnotationConfigApplicationContext. It is a replacement for XML configuration files that define beans. The configuration can then be read from the configuration class somewhere else:

public class ConfigurationTest {
    @Test
    public void testHelloworld () {
        AnnotationConfigApplicationContext ctx = new AnnotationConfigApplicationContext(AppConfig.class);
        Assert.assertEquals("hello", ctx.getBean("message")); // message is the name of the Bean method in configuration class
        // it can also be used to get any kind of object
    }
}

AppConfig is the class that is used as the configuration source.

3.4 Automatically assign properties.

In configuration source, add @EnableConfigurationProperties(A.class). In target class A, add @ConfigurationProperties(prefix = "PREFIX"). Then the properties under PREFIX will be automatically assigned to the attributes of class A. DETAILS OF THE CONFIGURATION SOURCE???

@EnableAutoConfiguration auto-configures the application. I don’t know the details of this annotation, but it wires the configuration into the embedded Tomcat. It also configures the database (host and port) automatically when a database template is injected in the project.

4 Authentication & Securing

It is mainly implemented through HttpSecurity, which is passed to and configured in a method of a class that extends WebSecurityConfigurerAdapter.

public class SecurityConfig extends WebSecurityConfigurerAdapter {
    @Inject
    private UserDetailsService userDetailsService;
    // class access to the user database

    @Value("${drill.security.rememberme.key}")
    private String remerberMeKey;
    // key for cookies

    @Inject
    public void configureGlobal(AuthenticationManagerBuilder auth) throws Exception {
        auth
            .userDetailsService(userDetailsService) // the point that connect the authentication module and user database
            .passwordEncoder(passwordEncoder());
    }

    @Override
    public void configure(WebSecurity web) throws Exception {
        // Spring security ignore following URL
        web.ignoring()
            .antMatchers("/bower_components/**")
            .antMatchers("/fonts/**")
            .antMatchers("/images/**")
            ...;
    }

    @Override
    protected void configure(HttpSecurity http) throws Exception {
        http.authorizeRequests()
                .antMatchers("/app/**").authenticated()
                .antMatchers("/app/rest/register").permitAll()
                .antMatchers("/app/rest/logs/**").hasAuthority(Authority.ADMIN)
                // set authorities
            .and()
                .formLogin()
                .usernameParameter("username")
                .passwordParameter("password")
                .loginProcessingUrl("/app/authentication")
                .successHandler(ajaxAuthenticationSuccessHandler)
                .failureHandler(ajaxAuthenticationFailureHandler)
                .permitAll()
                // set login service url
            .and()
                .logout()
                .logoutUrl("/app/logout")
                .logoutSuccessHandler(ajaxLogoutSuccessHandler)
                .deleteCookies("JSESSIONID")
                .permitAll()
            .and()
                .exceptionHandling()
                .authenticationEntryPoint(authenticationEntryPoint)
            .and()
                .rememberMe()
                .rememberMeServices(rememberMeServices)
                // it is not persistent token
                .key(remerberMeKey)
            .and()
                .csrf().disable()  // not clear
                .headers().cacheControl(); // not clear
    }
}

User security is built on sessions and cookies. Once the user has logged in, the server sets cookies in the browser. The browser then sends the cookies via HTTP every time it sends a new request to the server. On the server side, a session is kept with a unique random id to identify the client. See another post, Init the project, for how to replace the in-memory user store with a repository that connects to a database. On the server side, the programmer can get the user information with:

public static String getCurrentLogin() {
    SecurityContext securityContext = SecurityContextHolder.getContext();
    Authentication authentication = securityContext.getAuthentication();
    UserDetails springSecurityUser = null;
    String userName = null;

    if(authentication != null) {
        if (authentication.getPrincipal() instanceof UserDetails) {
            springSecurityUser = (UserDetails) authentication.getPrincipal();
            userName = springSecurityUser.getUsername();
        } else if (authentication.getPrincipal() instanceof String) {
            userName = (String) authentication.getPrincipal();
        }
    }

    return userName;
}

2 Securing Web

This example uses the WebMvc framework, which uses WebMvcConfigurerAdapter and ViewControllerRegistry to connect URLs and HTML templates.

Add spring-boot-starter-security to the dependencies and create WebSecurityConfig.java.

@Configuration
@EnableWebMvcSecurity
public class WebSecurityConfig extends WebSecurityConfigurerAdapter {
    @Override
    protected void configure(HttpSecurity http) throws Exception {
        http
            .authorizeRequests()
                .antMatchers("/", "/home").permitAll()
                .anyRequest().authenticated()
                .and()
            .formLogin()
                .loginPage("/login")
                .permitAll()
                .and()
            .logout()
                .permitAll();
    }

    @Autowired
    public void configureGlobal(AuthenticationManagerBuilder auth) throws Exception {
        auth
            .inMemoryAuthentication()
                .withUser("user").password("password").roles("USER");
        // in-memory user repository with username = "user" and password = "password"
        // it is just for demo purposes
    }
}

1 Simple POM

Maven projects are defined with an XML file named pom.xml. This file gives the project’s name, version, and dependencies that it has on external libraries. Here is an example.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>org.springframework</groupId>
    <artifactId>gs-maven</artifactId>
    <packaging>jar</packaging>
    <version>0.1.0</version>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.1</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <transformers>
                                <transformer
                                    implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <mainClass>hello.HelloWorld</mainClass>
                                </transformer>
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
  • <modelVersion> POM model version
  • <groupId> Domain name of the group or organization.
  • <artifactId> Name to the project’s library artifact (name of JAR file)
  • <version> Version of the project
  • <packaging> How it should be packaged (in JAR or WAR file)

2 Basic command

mvn compile
mvn package
mvn install

The first command compiles the project; the .class files should end up in the target/classes directory. The second command compiles the code, runs the tests and finishes by packaging the code into a JAR file in the target directory. The third command does the same as the second, then copies the JAR file into the local dependency repository, under directories named after the groupId, artifactId and version. So on my machine, its location is ~/.m2/repository/org/springframework/gs-maven/0.1.0/gs-maven-0.1.0.jar.
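
The way the coordinates turn into that path can be sketched in a few lines of Scala. This is only an illustration: localRepoPath is a made-up helper, and it assumes the default repository location ~/.m2/repository and a plain JAR artifact without classifier.

```scala
object LocalRepo {
  // Hypothetical helper: builds the location of a JAR in the local repository.
  // Dots in the groupId become directory separators; artifactId and version
  // each get their own directory and are repeated in the file name.
  def localRepoPath(groupId: String, artifactId: String, version: String): String = {
    val groupDir = groupId.replace('.', '/')
    s"~/.m2/repository/$groupDir/$artifactId/$version/$artifactId-$version.jar"
  }
}
```

With the coordinates of the POM above, LocalRepo.localRepoPath("org.springframework", "gs-maven", "0.1.0") gives the same location as the one found on disk.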

Add dependencies of project into the pom.xml file, in the <project> element.

<dependencies>
     <dependency>
         <groupId>joda-time</groupId>
         <artifactId>joda-time</artifactId>
         <version>2.2</version>
     </dependency>
 </dependencies>

This description tells maven to fetch the joda-time package as an external library. You can add a <scope> element to declare that a dependency is required for compiling the code but will be provided at runtime (provided), or that it is only necessary for compiling and running tests (test).

When I run the compile command, maven downloads the joda-time package from https://repo.maven.apache.org/maven2/joda-time/. Does maven use its own domain by default when the group id does not contain a domain?

You can also create a project from scratch

mvn archetype:generate -DgroupId=com.mycompany.app -DartifactId=my-app -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false

mvn scala:console can launch the scala console with the pom configurations (you have to specify maven-scala-plugin).

3 Directory layout

project
    src
        main
            java
            [scala]
            resources
                environment.properties
                environment.test.properties
                environment.prod.properties
                config
                    application.yml
                    application-dev.yml
                    application-[profile].yml
                [other]
        test
            java
            [scala]
    target
        classes
            the classes with the same structure as src/main
            the directories in src/main/resources
        test-classes
            the classes with the same structure as src/test
        xxx.jar

1 Monad

Honestly, I don’t understand it, but I collect my thoughts about it here. First, a monad is used to chain operations: if the first (preceding) operation fails, the whole chain stops. An example that I understand is the for-expression with pattern matching. If the pattern matches, go on to the next operation, else skip this element.

Another point of view concerns the unit operation: applying flatMap to a monad with its unit operation returns the monad itself.

My conclusion:

A monad is a container that supports chaining operations and follows the monad laws. The basic operations of this container are flatMap and unit, where flatMap accepts a function that maps each element to a container and then binds these containers together; unit returns a container given an element. The monad laws guarantee the reliability of operation chains on these containers.

Monad seems like a design pattern that enables operation chains along with pattern matching in functional programming. Because a flatMap chain with different functions is equivalent to nested flatMap calls, the for-expression is syntactic sugar based on a flatMap chain. The advantage of this design pattern is avoiding side effects.
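
To make the "syntactic sugar" claim concrete, here is a small sketch using Option as the container. The object ForDesugar and the helper parse are made up for this illustration; the two add methods are meant to be equivalent, the second being roughly what the compiler generates from the first.

```scala
object ForDesugar {
  // Hypothetical helper: parse a string into an Int, None if it is not a number
  def parse(s: String): Option[Int] =
    try Some(s.trim.toInt) catch { case _: NumberFormatException => None }

  // for-expression form: stops with None as soon as one step fails
  def addFor(a: String, b: String): Option[Int] =
    for {
      x <- parse(a)
      y <- parse(b)
    } yield x + y

  // The same computation written as a flatMap chain ending in map
  def addFlatMap(a: String, b: String): Option[Int] =
    parse(a).flatMap(x => parse(b).map(y => x + y))

  // flatMap with the unit operation (here: Some) gives back the monad itself
  val m: Option[Int] = Some(5)
  val sameAsM: Option[Int] = m.flatMap(x => Some(x))
}
```

If either input fails to parse, both versions return None without evaluating the addition, which is the "whole chain stops" behaviour described earlier.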

A generator is created by yield together with a for-expression. Pay attention: there is no generator class/type/trait. The generator created through a for-expression has the same type as the sequence/iterator/stream used in the for-expression. That shows the for-expression is syntactic sugar for monad types. So, guess what will happen when we create a for-expression using both a list and a stream?

val fibo:Stream[Int] = 1 #:: 1 #:: fibo.zip(fibo.tail).map(x=>x._1+x._2)
val l = List(1,2,3,4,5)
val lsg = for{x <- l; fib <- fibo} yield (x,fib)// infinite loop, crash the software/your machine
val slg = for{fib <- fibo; x <- l} yield (x,fib)// return a stream

Following the definition of a monad, the first mixture lsg is actually l flatMap (x => fibo map ...), which returns a List and therefore tries to expand the infinite stream fibo, hence it blocks your machine. The second mixture slg returns a Stream that lazily expands the list l, hence it works well. Besides, I have to clarify one thing: the flatMap of each monad demonstrated above accepts a function that returns a GenTraversableOnce and returns a monad of its own type. That is why they can be mixed together, and why the first generator decides the type of the final output of the for-expression.

2 ScalaCheck & ScalaTest & JUnit

In the example provided by the online course, JUnit, ScalaTest and ScalaCheck are used together. In that example, the Properties class is in src/main/scala and is used by another class defined under src/test/scala. In the class under the test folder, several instances of the Properties class are created to check different child classes of the target class.

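import org.scalacheck._
import Arbitrary._
import Gen._
import Prop._

// Properties that every correct IntHeap implementation should satisfy
abstract class QuickCheckHeap extends Properties("Heap") with IntHeap {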

  property("min1") = forAll { a: Int =>
    val h = insert(a, empty)
    findMin(h) == a
  }

  lazy val genHeap: Gen[H] = for {
    a <- arbitrary[Int]
    h <- oneOf(value(empty), genHeap)
  } yield insert(a, h)

  implicit lazy val arbHeap: Arbitrary[H] = Arbitrary(genHeap)

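}

import org.junit.runner.RunWith
import org.scalacheck.Prop
import org.scalatest.FunSuite
import org.scalatest.exceptions.TestFailedException
import org.scalatest.junit.JUnitRunner
import org.scalatest.prop.Checkers

@RunWith(classOf[JUnitRunner])
class QuickCheckSuite extends FunSuite with Checkers {
  // Passes only when the given property fails, i.e. the bogus heap is caught
  def checkBogus(p: Prop): Unit = {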
    var ok = false
    try {
      check(p)
    } catch {
      case e: TestFailedException =>
        ok = true
    }
    assert(ok, "A bogus heap should NOT satisfy all properties. Try to find the bug!")
  }
  test("Binomial heap satisfies properties.") {
    check(new QuickCheckHeap with BinomialHeap)
  }
  test("Bogus (1) binomial heap does not satisfy properties.") {
    checkBogus(new QuickCheckHeap with Bogus1BinomialHeap)
  }
  test("Bogus (2) binomial heap does not satisfy properties.") {
    checkBogus(new QuickCheckHeap with Bogus2BinomialHeap)
  }
  test("Bogus (3) binomial heap does not satisfy properties.") {
    checkBogus(new QuickCheckHeap with Bogus3BinomialHeap)
  }
  test("Bogus (4) binomial heap does not satisfy properties.") {
    checkBogus(new QuickCheckHeap with Bogus4BinomialHeap)
  }
  test("Bogus (5) binomial heap does not satisfy properties.") {
    checkBogus(new QuickCheckHeap with Bogus5BinomialHeap)
  }
}

2.1 Tutorial of ScalaCheck

ScalaCheck is a tool for property-based testing in Scala. It has NO external dependencies and is integrated into the ScalaTest test framework. It can also be used standalone, with its built-in test runner. First, create a class that extends org.scalacheck.Properties, passing the name of the data object that you want to test. The library uses this class to generate test data for your algorithm.

Second, create a test case with

property("NAME_OF_FUNCTION") = forAll{
    CONTAINER_OF_DATA => TEST_CASE_WITH_TARGET_FUNCTION
}

forAll is the function org.scalacheck.Prop.forAll.

Third, to run the tests, put the properties in src/test/scala and run the test task to check them.
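To get a feeling for what forAll does, here is a hand-rolled stand-in written with the standard library only. It is NOT the ScalaCheck API: forAllInts, runs and seed are hypothetical names, and the real library also shrinks counterexamples and supports arbitrary types.

```scala
import scala.util.Random

object TinyCheck {
  // Hypothetical stand-in for Prop.forAll: try `runs` random Ints and
  // return the first counterexample, if there is one.
  def forAllInts(runs: Int, seed: Long)(property: Int => Boolean): Option[Int] = {
    val rnd = new Random(seed)
    Iterator.fill(runs)(rnd.nextInt()).find(a => !property(a))
  }

  // Holds for every Int: the minimum of a one-element list is that element
  val min1: Option[Int] = forAllInts(100, 42L)(a => List(a).min == a)

  // Bogus property: it only holds for even numbers, so a counterexample shows up
  val bogus: Option[Int] = forAllInts(100, 42L)(a => a / 2 * 2 == a)
}
```

A result of None means that no counterexample was found in the given number of runs, which is evidence for the property but not a proof.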

2.2 ScalaTest

ScalaTest is a test framework that integrates everything together. It provides many test styles: FunSuite, FlatSpec, FunSpec, WordSpec, FreeSpec, Spec, PropSpec, FeatureSpec. They differ mainly in syntax. It is recommended to create a base class that mixes in all the required features (a subset of all the provided features) and to extend this base class everywhere in the project to keep the style consistent.

2.2.1 With JUnit

The example class uses the JUnit framework for unit tests. JUnit invokes the class that carries the @RunWith annotation (the annotation is inherited through the type hierarchy) to run the tests. The annotation parameter in the example is the class JUnitRunner; in fact, any class that extends Runner is acceptable. Note that classOf[T] is the Scala counterpart of Java's T.class and yields the same Class object as obj.getClass() called on an instance of exactly that class.

Maven: run the tests with the command mvn test. In fact, the example above does not run as is. The solution is to rename QuickCheckSuite to QuickCheckSuiteTest, TestQuickCheckSuite or QuickCheckSuiteTestCase. Even though I did not explicitly specify the surefire plugin, Maven uses it to run the tests with my configuration. By default, surefire follows naming conventions when looking for tests to run, so I have to either change the name or modify the surefire configuration to apply new naming rules.

The annotation takes the class JUnitRunner as its parameter. This class is provided by the ScalaTest framework and connects ScalaTest with JUnit.

  • As mentioned for the ScalaTest Maven Plugin, you can run tests without this annotation by using that plugin. TODO

2.2.2 With ScalaCheck

To work with ScalaCheck, you have to mix the trait org.scalatest.prop.Checkers into your test class. The Checkers trait provides the check method to perform ScalaCheck property checks, which are defined in the class QuickCheckHeap in the example above. In fact, you can also use the check method in JUnit directly. This method generates a large number of possible inputs to look for one for which the property does not hold.

3 Async

3.1 Exception => Future

An exception is handled by Try[T], which is used as a return type like Option. But Try[T] matches either Success with a result or Failure with a throwable. As Try[T] is also a monad, operations on the element T are usually lifted into the monad methods of Try[T] to get a new Try[U]: mapping a Try[T] with f: T => U returns a new Try[U].
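A small sketch of this behaviour, using only scala.util.Try from the standard library. The helper names parse and describe are made up for the illustration.

```scala
import scala.util.{Failure, Success, Try}

object TrySketch {
  def parse(s: String): Try[Int] = Try(s.toInt)

  // map applies the function inside a Success and leaves a Failure untouched
  val doubled: Try[Int] = parse("21").map(_ * 2)
  val failed: Try[Int] = parse("abc").map(_ * 2)

  // An exception thrown inside map is captured as a Failure as well
  val captured: Try[Int] = parse("0").map(10 / _)

  def describe(t: Try[Int]): String = t match {
    case Success(v) => s"ok: $v"
    case Failure(e) => s"failed: ${e.getClass.getSimpleName}"
  }
}
```

describe(doubled) gives "ok: 42", while failed and captured carry a NumberFormatException and an ArithmeticException respectively, without any try/catch in the calling code.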

Future[T] is a container of Try[T]. The user provides operations/callbacks that deal with Try[T], and the future calls them at some later time. Future and Try form a nested monad: monad operations are propagated, and you rarely see direct operations on T in the source code of the Scala library. But take it easy, usually you won’t have to do it yourself.

To get/compose a future (that can be our job), there are four methods:

  1. Use a member function of Future:
    1. flatMap, the binary bind operation of futures. The resulting promise is completed from this when this fails or when calling the function throws an exception; it is completed from that (the future returned by the function) when this succeeds.
    2. onComplete, whose parameter is a function Try[T] => U. This is the fundamental function behind map, foreach, failed, etc. It propagates the Try mechanism.
    3. collect
    4. recover
    5. fallbackTo
    6. zip
    7. andThen
  2. Create a promise, register its complete method on the current futures, then return the new Future of this promise.

    A promise acts like a mailbox: to compose a new future, the old future has to put its result, which is a Try[T], into the promise through the complete method. The new future then calls its callbacks when complete is called on the promise.

    In my opinion, a promise could be merged with its future into one unit, because it is more straightforward to think that the future calls the callbacks when it is completed. In fact, DefaultPromise actually extends the Future trait. Of course, the designers of Scala have their own reasons.

    Currently, I think a promise is used for a race (concurrent processing of the results of futures) or to provide a new future as the initial value of fold operations.

  3. Use a for-expression, which is syntactic sugar for flatMap.
  4. Use the async...await syntax.
    1. The Scala implementation of await is translated to onComplete by the macro.
    2. await throws the exception directly when the future has failed.
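The first three methods can be sketched with the standard library as below. The helper names (plusOne, orMinusOne, first, sum, await) are invented for the illustration, and the global execution context is just a convenient assumption. The async...await style is left out because it needs the separate scala-async library.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureSketch {
  // 1. Member functions: flatMap chains futures, recover handles a failure
  def plusOne(f: Future[Int]): Future[Int] = f.flatMap(x => Future(x + 1))
  def orMinusOne(f: Future[Int]): Future[Int] =
    f.recover { case _: IllegalStateException => -1 }

  // 2. Promise as a mailbox: the first future to finish puts its Try into p
  def first[T](f1: Future[T], f2: Future[T]): Future[T] = {
    val p = Promise[T]()
    f1.onComplete(result => p.tryComplete(result))
    f2.onComplete(result => p.tryComplete(result))
    p.future
  }

  // 3. For-expression, sugar for flatMap/map
  def sum(f1: Future[Int], f2: Future[Int]): Future[Int] =
    for { x <- f1; y <- f2 } yield x + y

  // Blocking helper, only meant for trying the sketch out
  def await[T](f: Future[T]): T = Await.result(f, 5.seconds)
}
```

For example, FutureSketch.await(FutureSketch.first(Promise[Int]().future, Future.successful(7))) returns 7: the first argument never completes, so the promise is filled by the second one. This is the "race" usage of a promise mentioned above.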

3.2 Iterable => Observable

Try and Future are dual, and so are Iterable and Observable. For an Iterable, you get an iterator and use hasNext and next to pull elements. For an Observable, you subscribe and receive elements through onNext and onComplete.
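The pull/push duality can be sketched in plain Scala. The Observer and Observable traits below are hypothetical minimal versions written for this note, not the RxScala API (no error channel, no subscription handle).

```scala
object PushPullSketch {
  // Pull: the consumer asks the iterator for the next element
  def pullAll[T](it: Iterator[T]): List[T] = {
    val buf = List.newBuilder[T]
    while (it.hasNext) buf += it.next()
    buf.result()
  }

  // Push: the producer calls the observer (hypothetical minimal traits)
  trait Observer[T] {
    def onNext(value: T): Unit
    def onCompleted(): Unit
  }
  trait Observable[T] {
    def subscribe(observer: Observer[T]): Unit
  }

  def fromList[T](xs: List[T]): Observable[T] = new Observable[T] {
    def subscribe(observer: Observer[T]): Unit = {
      xs.foreach(x => observer.onNext(x))
      observer.onCompleted()
    }
  }

  // Collect everything that is pushed until the source signals completion
  def pushAll[T](source: Observable[T]): List[T] = {
    val buf = List.newBuilder[T]
    var done = false
    source.subscribe(new Observer[T] {
      def onNext(value: T): Unit = { buf += value; () }
      def onCompleted(): Unit = { done = true }
    })
    if (done) buf.result() else Nil
  }
}
```

Both pullAll(List(1, 2, 3).iterator) and pushAll(fromList(List(1, 2, 3))) produce List(1, 2, 3); the only difference is who drives the loop.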

flatten after map returns an observable that merges several streams, interleaving their elements in arrival order. concat after map returns an observable in which the elements of each stream are kept together, sequentially, following the order of the streams.

I don’t understand well the way of creating an observable with a function from Observer to Subscription -_-!

3.2.1 Subject

A subject works like a promise: it is used as a bridge or chain between Observer and Observable. Four types of subjects are provided:

  1. PublishSubject: send current value.
  2. BehaviorSubject: cache the latest output.
  3. AsyncSubject: cache the final output. You only get the final output.
  4. ReplaySubject: cache the whole history.

3.2.2 Notification

Like Try

3.2.3 Subscription

A handle that enables a process to stop receiving messages from an Observable.

See example POM Configuration.

1 Identify our project

The fully qualified name of a project is “<groupId>:<artifactId>:<version>”. If the groupId is a domain, it is transformed into a directory path while downloading: downloading something with groupId=org.X.A from url.com ends up downloading from url.com/org/X/A.

2 Inheritance

When we have two projects, A and B, and A is supposed to be the parent project of B, the pom.xml of project B has to indicate that the parent of B is A, using the groupId, artifactId and version of A. You can remove the groupId and version of B to make A and B share the same groupId and version. One can also set <relativePath> for the parent project in order to put A and B at the same directory level.

B inherits the dependencies from A. So parent is used to inherit the POM, while dependency is used to resolve code dependencies. The parent project has no relationship with the child at the code level.

The packages downloaded by Maven are put under $HOME/.m2/repository/.

Every element in pom.xml can be referenced as ${project.A.B.C}. For example, if the XML root contains the structure <A><B>lalala<C>blabla</C></B></A>, then ${project.A.B.C} equals blabla.

3 Tasks

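First of all, Maven can use a series of plugins to support various functions, including publishing, running tests, and so on. When using spring-boot-maven-plugin, you can run the program with mvn spring-boot:run. spring-boot should be the name of the plugin, and run is the predefined task. When using maven-scala-plugin, if the following settings are added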

<executions>
    <execution>
        <goals>
            <goal>compile</goal>
            <goal>testCompile</goal>
        </goals>
    </execution>
</executions>

running mvn scala:testCompile compiles only the test part of the code.

1 Lambda function

  • function is value,
    val f = (x:Int) => x*x
    

    One difference between def f = ... and val f = ... is that def is evaluated on every call and returns a new function instance every time.

    val test: () => Int = {
        println("val")
        val r = util.Random.nextInt
        () => r
    }
    // returns the same result every time you call test(),
    // "val" will be printed after definition
    
    def test: () => Int = {
        println("def")
        val r = util.Random.nextInt
        () => r
    }
    // returns a different result every time you call test(),
    // "def" is printed every time you call it.
    
    val tt = test
    // acts like the first example
    
    val test = () => util.Random.nextInt
    def test = () => util.Random.nextInt
    // they behave the same: you have to call the function with test().
    // the def declaration actually defines a method that returns a function.
    // if you type test without parentheses, you receive a function value.
    
    def test = util.Random.nextInt
    // here test is a method that calls nextInt directly,
    // so you call it by name, without parentheses
    

    A method/expression that starts with def is evaluated every time you call obj.f, while one that starts with val is evaluated once. In addition, if you declare a val that is assigned from a def, the def is evaluated immediately, and the new val acts like the first case in the examples above.

  • Abbreviation:
    val l = List(1,2,3)
    l.map((x) => x*x)
    l.map( x => x*x) // omit the parentheses
    l.map( _ % 2 == 0) // omit the parameter, using the placeholder in the expression
    l.map(f(_)) // using the placeholder for the function argument
    l.map(f) // omit the placeholder as well
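The val/def difference described above can be checked by counting how often the defining block runs. This is a small sketch with made-up names (fromVal, fromDef, bound); a fixed number replaces the random one so that the results are predictable.

```scala
object ValDefSketch {
  var valBodyRuns = 0
  var defBodyRuns = 0

  // The block on the right-hand side runs once, at definition
  val fromVal: () => Int = {
    valBodyRuns += 1
    val r = 7
    () => r
  }

  // The block runs on every reference to fromDef
  def fromDef: () => Int = {
    defBodyRuns += 1
    val r = 7
    () => r
  }

  val valResults: List[Int] = List(fromVal(), fromVal()) // block ran once in total
  val defResults: List[Int] = List(fromDef(), fromDef()) // block ran twice
  val bound: () => Int = fromDef                         // block runs a third time, here
  val boundResults: List[Int] = List(bound(), bound())   // no further runs
}
```

After the object is initialized, valBodyRuns is 1 and defBodyRuns is 3, which matches the rule: val evaluates once, def evaluates on every reference, and binding a def to a val evaluates it immediately.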
    

2 Type hierarchy

An abstract class is like the one in Java, and a trait is like an interface in Java. However, a trait can have type parameters, fields and implemented methods. In addition, a trait is not an abstract class since it cannot have constructor parameters.

  • Logical problem: is List[Parent] the parent of List[Child]?

    It is not true for a mutable collection.

    val s: Array[Man] = Array(new Man())
    val t: Array[Human] = s // rejected by the Scala compiler, because Array is invariant
    t(0) = new Woman()
    val m: Man = s(0) // what is it? man or woman?
    

    But we can use an immutable collection.

  • Covariant, Contravariant, Invariant
    • Covariant: defined by Class[+T], Class[Parent] is the parent of Class[Child]
    • Contravariant: defined by Class[-T], Function[Parent] is the child of Function[Child]

    The principle is: a child type can do anything that can be done by its parent type.

  • Checking rules and bounds

    To avoid conflicts, the compiler enforces that +T can only be used as a return type and -T can only be used as an argument type. A way to bypass this restriction is to use a bound.

    abstract class List[+T] {
         def prepend[U >: T](elem: U): List[U]
    }
    

    Notice that in a generic type definition, one can use A >: B and A <: B to add constraints to the type parameter.
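A compilable sketch of these rules follows. Stack, Empty and Cons are hypothetical names chosen to avoid clashing with the standard List; Human, Man and Woman follow the array example above.

```scala
object VarianceSketch {
  class Human
  class Man extends Human
  class Woman extends Human

  // Covariant immutable list: Stack[Man] is a subtype of Stack[Human]
  sealed trait Stack[+T] {
    // The lower bound U >: T lets prepend accept a supertype and widen the result
    def prepend[U >: T](elem: U): Stack[U] = Cons(elem, this)
    def size: Int
  }
  case object Empty extends Stack[Nothing] { def size: Int = 0 }
  case class Cons[+T](head: T, tail: Stack[T]) extends Stack[T] {
    def size: Int = 1 + tail.size
  }

  val men: Stack[Man] = Empty.prepend(new Man)
  val humans: Stack[Human] = men                   // allowed because of +T
  val mixed: Stack[Human] = men.prepend(new Woman) // U is inferred as Human

  // Contravariance: a function on Human can be used where a function on Man is expected
  val describe: Human => String = _ => "human"
  val describeMan: Man => String = describe        // Function1 is declared as [-A, +B]
}
```

Unlike the mutable Array example, nothing can go wrong here: prepend never modifies men, it returns a new, wider Stack[Human].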

3 Pattern match

Case classes let you define classes with different parameters under a common base type

abstract class CodeTree
case class Fork(left: CodeTree, right: CodeTree, chars: List[Char], weight: Int) extends CodeTree
case class Leaf(char: Char, weight: Int) extends CodeTree

For the classes listed above, given a value whose static type is CodeTree, you cannot directly access any parameter (field) of Fork or Leaf. You have to use case Fork(a,b,_,_) to access the parameters.

A case match can also be used in a lambda function as a short version.

List.map(p => p._1 * p._2)

List.map({case (x,y) => x * y})
//List.map{case (x,y) => x * y} also works

The Option class is used when the returned value can be empty/null, etc.

If you want to match against a variable in the context, you have to either use a capitalized variable name Variable or wrap it in backticks `variable`.
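A short sketch that puts these points together, reusing the CodeTree classes from above. The helper names weight and isChar are only for illustration.

```scala
object MatchSketch {
  abstract class CodeTree
  case class Fork(left: CodeTree, right: CodeTree, chars: List[Char], weight: Int) extends CodeTree
  case class Leaf(char: Char, weight: Int) extends CodeTree

  // Matching recovers the fields of the concrete case class
  def weight(tree: CodeTree): Int = tree match {
    case Fork(_, _, _, w) => w
    case Leaf(_, w)       => w
  }

  // Backticks compare against the existing value instead of binding a new name
  def isChar(tree: CodeTree, expected: Char): Boolean = tree match {
    case Leaf(`expected`, _) => true
    case _                   => false
  }

  val tree: CodeTree = Fork(Leaf('a', 2), Leaf('b', 3), List('a', 'b'), 5)

  // A case match used as a short lambda
  val products: List[Int] = List((1, 2), (3, 4)).map { case (x, y) => x * y }
}
```

Without the backticks, case Leaf(expected, _) would bind a fresh variable named expected and match every Leaf.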

4 Collections

  • List is a linked chain, while Vector is a tree in which each node is an array of 32 elements. The depth of a Vector grows slowly, as log_{32}(N). Important functions: map, filter, reduceRight, reduceLeft, foldRight, foldLeft, flatten, flatMap, take, drop, takeWhile, dropWhile, span, grouped, groupBy.

    A variable-length parameter list (a sequence as parameter) is written as this(par: T*).

  • Map.
    • The + and ++ operations overwrite the existing key.
    • Initialized by Map(k1->v1, k2->v2, k3->v3)
    • Set a default value with withDefaultValue; this function returns a new map (everything in this course is immutable).
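The Map operations above, as a minimal sketch with the standard immutable Map:

```scala
object MapSketch {
  val base: Map[String, Int] = Map("a" -> 1, "b" -> 2)

  val updated: Map[String, Int] = base + ("a" -> 10)              // overwrites key "a"
  val merged: Map[String, Int] = base ++ Map("b" -> 20, "c" -> 3) // the right side wins

  // withDefaultValue returns a new map; base itself is unchanged
  val withDefault: Map[String, Int] = base.withDefaultValue(0)
  val missing: Int = withDefault("zzz")
  val stillAbsent: Option[Int] = base.get("zzz")
}
```

Note that base("zzz") would still throw NoSuchElementException, since only the map returned by withDefaultValue knows the default.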

5 Stream, Iterator, Generator and lazy

5.1 Stream

A Stream acts like a sequence, but its elements are not evaluated until they are needed. It can be used to skip unnecessary computation and to code infinite concepts.

val fibo:Stream[Int] = 1 #:: 1 #:: fibo.zip(fibo.tail).map(x=>x._1+x._2)
println(fibo.take(10).toList)

def from(n: Int): Stream[Int] = n #:: from(n+1) // infinite integer stream starting from n
// recursive stream
def sieve(s: Stream[Int]): Stream[Int] =
   s.head #:: sieve(s.tail filter (_ % s.head !=0)) // stream of primes

val primes = sieve(from(2))
primes.take(10).toList // the first 10 primes, starting from 2.

Three ways of using stream

  • Transform from other collections, then you use functions like map, etc. to generate new stream.
  • elem #:: Stream
  • Transform from iterator, use for loop to create an iterator with what you want and then convert it into a stream.

Because Stream has an API consistent with List/Seq/Vector, you can use it as if you had a collection that contains everything.

5.2 Iterator

The difference between a stream and an iterator is that a stream memoizes the values while an iterator doesn’t.
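The memoization can be observed by counting how often the mapping function runs. A small sketch with hypothetical counter names; it uses Stream as in the rest of these notes (newer Scala versions deprecate Stream in favour of LazyList).

```scala
object MemoSketch {
  var streamEvaluations = 0
  var iteratorEvaluations = 0

  // Stream: computed cells are kept, so a second traversal does no new work
  val stream: Stream[Int] = Stream.from(1).map { x => streamEvaluations += 1; x * 2 }
  val firstPass: List[Int] = stream.take(3).toList
  val afterFirstPass: Int = streamEvaluations
  val secondPass: List[Int] = stream.take(3).toList
  val afterSecondPass: Int = streamEvaluations

  // Iterator: nothing is kept, so every new traversal recomputes the elements
  def doubled: Iterator[Int] = Iterator.from(1).map { x => iteratorEvaluations += 1; x * 2 }
  val firstRun: List[Int] = doubled.take(3).toList
  val afterFirstRun: Int = iteratorEvaluations
  val secondRun: List[Int] = doubled.take(3).toList
  val afterSecondRun: Int = iteratorEvaluations
}
```

Both passes return List(2, 4, 6); the stream counter stays the same between the two passes, while the iterator counter keeps growing because a fresh iterator has to be created and evaluated again.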

Iterator can be used in for-expression. For-expression can also use pattern match.

for{ pattern(x) <- seq; pattern2(y) = x(k); ...} yield ...

As shown in the example above, you can apply a pattern while looping over the elements of a sequence, and you can even add some simple statements in the for-expression. It is roughly equivalent to:

seq.withFilter({case pattern(x) => true; case _ => false})
   .map({case pattern(x) => x(k)})
   .withFilter({case pattern2(y) => true; case _ => false})
   .map({case pattern2(y) => y})
   ...

The function withFilter returns a FilterMonadic, which provides four functions: flatMap, map, foreach, withFilter. From a programming point of view, it works like a list of callbacks. From Wikipedia: in functional programming, a monad is a structure that represents computations defined as sequences of steps: a type with a monad structure defines what it means to chain operations, or nest functions of that type together.

5.3 lazy val

A lazy val is evaluated only once, the first time it is accessed. You can also define a lazily evaluated (by-name) parameter with def myFunc[T](param: => T); the argument expression is then evaluated each time param is used inside myFunc, and not at all if it is never used.
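The difference between a lazy val and a by-name parameter can be seen by counting evaluations. The names below (expensive, twice, ignore) are made up for this sketch.

```scala
object LazySketch {
  var lazyRuns = 0
  var byNameRuns = 0

  // Evaluated on first access only, then cached
  lazy val expensive: Int = { lazyRuns += 1; 42 }

  // param is re-evaluated at every use inside the body
  def twice[T](param: => T): (T, T) = (param, param)

  // param is never used, so the argument expression is never evaluated
  def ignore[T](param: => T): String = "ignored"

  val runsBeforeAccess: Int = lazyRuns
  val sum: Int = expensive + expensive
  val pair: (Int, Int) = twice { byNameRuns += 1; byNameRuns }
  val ignored: String = ignore { byNameRuns += 100; 0 }
}
```

Here lazyRuns ends at 1 although expensive is read twice, pair is (1, 2) because the argument block runs twice, and the block passed to ignore never runs.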

1 Two-phase commits

Two-phase commit is used to update multiple documents as a fake atomic operation. The principle of two-phase commit is to create temporary, intermediate records to support rollback operations.

2 Concurrency control

The first method is adding a label that indicates which application is currently accessing the document.

The second method is using the old values of the fields to update as part of the query, to ensure that the target fields of the current document have not been updated in the meantime.

var myDoc = db.COLLECTION.findOne(condition);
if(myDoc){
   var oldValue = myDoc.value;
   var results = db.COLLECTION.update(
       {
           _id: myDoc._id,
           value: oldValue
       },
       {
           $inc:{value: -fee}
       }
       )
}

Another possible option: if you want to deduct an amount of money from an account and the remaining balance must not be negative, you can use the following command:

db.COLLECTION.update({_id: id, value: {$gte: fee}}, {$inc:{value: -fee}})

It means the account is updated only if the value is sufficient.

The third method is adding a unique field (version) to the document.
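The idea shared by the second and third methods (update only if the document still looks the way it did when it was read) can be modelled in plain Scala. This is an in-memory sketch with hypothetical names (Account, updateIf, deduct), not MongoDB driver code; in MongoDB the match and the update of a single document happen atomically on the server.

```scala
object ConditionalUpdateSketch {
  // Hypothetical in-memory stand-in for a collection: _id -> document
  case class Account(id: String, value: Int, version: Int)

  // Apply the change only if the stored document still satisfies the condition,
  // mirroring update({_id: id, <condition>}, {$inc: ...}).
  // Returns None when no document matched, i.e. nothing was updated.
  def updateIf(store: Map[String, Account], id: String)(cond: Account => Boolean)(
      change: Account => Account): Option[Map[String, Account]] =
    store.get(id).filter(cond).map(doc => store.updated(id, change(doc)))

  def deduct(fee: Int): Account => Account =
    doc => doc.copy(value = doc.value - fee, version = doc.version + 1)

  val store: Map[String, Account] = Map("a" -> Account("a", 100, 1))

  // Second method: the value read earlier must be unchanged
  val byOldValue = updateIf(store, "a")(_.value == 100)(deduct(30))
  // Alternative: the balance must cover the fee
  val insufficient = updateIf(store, "a")(_.value >= 500)(deduct(500))
  // Third method: the version read earlier must be unchanged
  val staleVersion = updateIf(store, "a")(_.version == 0)(deduct(30))
}
```

byOldValue contains the account with value 70 and version 2, while insufficient and staleVersion are None. The caller then has to re-read the document and retry, or give up.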

3 Model

3.1 Relationships between documents

3.1.1 One-to-One Embedded

Example: patron and address. User information should be put together, such as the account info (password, email) and application-related info (balance in account).

3.1.2 One-to-Many Embedded/References

Embedding is used for one-to-few relations, such as the authors of a book. One can use an array in a document to model such relations. However, if the embedded elements take only a few distinct values, it is better to save them in other documents and use references, for example the publishers of books. Another principle is DO NOT create large arrays in a document. Large arrays in a document are inefficient for three reasons:

  • When the document size grows, MongoDB has to move the document, which leads to rewriting the entire document.
  • Indexing the elements of an array is inefficient.
  • Querying a large document for a small part of an array is inconvenient.

3.2 Tree Structures

3.2.1 References

  • Parent Refs.

    Save each node as a document that contains a reference to its parent.

  • Child Refs

    Save each node as a document that contains an array of references to its child nodes.

  • Extension

    It is also useful to add additional relations, such as the ancestors of a node, to the document.

    Another way to accelerate searching a tree is to materialize paths. This method adds a path attribute that describes the path as a string. Searching such strings requires regular expressions, but is faster than the previous solution.

3.2.2 Nested Sets

This method is tricky. It works like a binary tree that uses left and right values to indicate the related nodes.

db.categories.insert( { _id: "Books", parent: 0, left: 1, right: 12 } )
db.categories.insert( { _id: "Programming", parent: "Books", left: 2, right: 11 } )
db.categories.insert( { _id: "Languages", parent: "Programming", left: 3, right: 4 } )
db.categories.insert( { _id: "Databases", parent: "Programming", left: 5, right: 10 } )
db.categories.insert( { _id: "MongoDB", parent: "Databases", left: 6, right: 7 } )
db.categories.insert( { _id: "dbm", parent: "Databases", left: 8, right: 9 } )

var databaseCategory = db.categories.findOne( { _id: "Databases" } );
db.categories.find( { left: { $gt: databaseCategory.left }, right: { $lt: databaseCategory.right } } )

One can retrieve the descendant nodes by using the left and right indicators.
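The same query can be replayed on an in-memory copy of the data to see why it works. This is a plain Scala sketch, not driver code; Category and descendants are hypothetical names, and the root's parent (0 in the inserts above) is modelled as None.

```scala
object NestedSetSketch {
  case class Category(id: String, parent: Option[String], left: Int, right: Int)

  val categories: List[Category] = List(
    Category("Books", None, 1, 12),
    Category("Programming", Some("Books"), 2, 11),
    Category("Languages", Some("Programming"), 3, 4),
    Category("Databases", Some("Programming"), 5, 10),
    Category("MongoDB", Some("Databases"), 6, 7),
    Category("dbm", Some("Databases"), 8, 9)
  )

  // Descendants are the nodes whose (left, right) interval lies strictly
  // inside the interval of the given node
  def descendants(id: String): List[String] =
    categories.find(_.id == id).toList.flatMap { node =>
      categories.filter(c => c.left > node.left && c.right < node.right).map(_.id)
    }
}
```

descendants("Databases") gives List("MongoDB", "dbm"), and descendants("Programming") also includes the grandchildren, which is why the left/right numbers retrieve the whole subtree in one query.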

3.3 Index

One of the main benefits of indexing is sorting: the application can quickly find the matching documents by traversing the ordered index entries.

  • Single field index
  • Compound key index
    Note that if one sets a compound index with
    db.COLLECTION.ensureIndex({key1: 1, key2: -1})
    

    it can help the sort function using {key1: 1, key2: -1} or {key1: -1, key2: 1}, but cannot help sorting with the key {key1: 1, key2: 1}.

  • Multi-key index
    It refers to using an array as the indexed field. You cannot use a compound key over two arrays.
  • Hashed index
    It hashes the value of the indexed field; for an embedded document, the hashing function collapses the sub-document and computes the hash of the entire value. It is not compatible with multi-key indexes.
  • Geospatial index
    This index is used to index geometric data, such as lines, points and shapes.
  • Text index
    Supports text search with language-specific stop words; it uses a very large amount of space.

Indexes also support properties like:

  • TTL index, which is used to remove outdated documents.
  • Unique index, which rejects duplicate values of the indexed field across all the documents. ??Can we use it to ensure that the elements of an array have unique values within a document (while they can repeat in different documents)??
  • Sparse index: documents without the field are not indexed. By default (non-sparse), these documents are indexed with null.

3.4 Others

  • Atomic operations: update, findAndModify and remove are atomic on a single document. Putting the fields in one document ensures that the write operations are atomic.
  • Support keyword search: putting strings in an array and then creating a multi-key index enables keyword search.

    Keyword search cannot provide NLP/IE functions such as stemming, synonyms and ranking, which are usually required by a search engine.

  • Document limit
    • Field names cannot start with $, cannot contain ., and cannot contain the null character.
    • A field value must stay under the maximum index key length to be usable in an index.
    • The maximum document size is 16 MB. GridFS supports larger files, each represented by a group of documents (chunks) that hold different pieces of the content.
    • _id field may be any BSON data type except the array. It is ObjectId by default.
  • DBRef

    DBRefs are a convention for representing a reference to a document rather than a specific reference type. A DBRef contains the name of the collection and the _id value (optional: the database name).

1 Insert

db.COLLECTION.insert(doc/docs)

You can insert one document or a list of documents. When you insert a list of documents, the operation is not atomic. The returned result shows the statistics of the write operations including the write errors and the number of successfully inserted documents.

2 Find

db.COLLECTION.find({name: value, 'name1.name2.name3':value/condition})

When ‘name#’ corresponds to an array, all the elements in the array are checked. Special conditions:

  • $in, $ne, $nin, $lt, $gt, $lte, $gte
  • $and, $or, $not, $nor
  • $exists, $type
  • $regex, $text, $where, $mod
  • $all, $elemMatch, $size
  • $, $slice

The difference between querying with and without $elemMatch is that $elemMatch requires a single array element to match all the conditions, while without $elemMatch different elements in the array can match different parts of the conditions.

Matching an element or a whole array can be done with the query {name: value}, where value is either a single variable/object or an array.

Elements in an array can be matched by index: db.COLLECTION.find({'name.0': value}).

3 Select

  • get attribute
    db.COLLECTION.find(***, {name:1})
    
  • get elements in array
    db.COLLECTION.find(***, {'name1.name2':1})
    

    This projection returns all the elements with the path ‘name1.name2’, even when any part of the path is an array.

    db.COLLECTION.find(***, {name1: {$elemMatch: {name2:value}}})
    

    $elemMatch can be used in the second argument to limit the returned elements in array. However, it CANNOT be used in nested array and only returns the FIRST element.

  • get elements in nested array.

    Use the aggregation framework with $unwind. Aggregations are operations that process data records and return computed results, organized as a pipeline.

    You can expand the nested arrays with several $unwind operations to get a large number of documents, each of which contains a single combination of elements from the different levels of the nested arrays. Then match the elements you need and maybe regroup them. However, aggregating the matched elements of nested arrays back into arrays after the $unwind operations is very complex.

    Alternatively, you can just return the matching elements (tasks, for example) with a $group operation that matches on the attributes of the nested elements.

4 Update

  • update attribute:
    db.COLLECTION.update(find condition, {$set: {name: value}})
    

    $inc increases a number, etc. upsert inserts the document when it does not exist in the collection.

  • update array:
    db.COLLECTION.update(find condition, {$push/$pull: {name: value}})
    
    db.students.update(
      { _id: 1 },
      {
        $push: {
          scores: {
            $each: [ { attempt: 3, score: 7 }, { attempt: 4, score: 4 } ],
            $sort: { score: 1 },
            $slice: -3
          }
        }
      }
    )
    

    The command above appends 2 elements, then sorts the array by ascending score, then keeps the last 3 elements of the ordered array.

  • update element in array:
    db.COLLECTION.update({"name1.name2":value}, {$set:{'name1.$.name2':value}})
    

    It only works for the first match.

    If you want to update multiple elements, use the forEach function.

    db.Projects.find().forEach(
        function(pj){
            pj.groups.forEach(
                function(gp){
                    gp.tasks.forEach(
                        function(task){
                            if (task.name.match(/update/)){
                                task.weight=5;
                            }
                        });
                });
            db.Projects.save(pj);
        }
    )
    
  • update element in nested array.

    Impossible with only one query. The issue was reported in 2010, but is still there in 2015…

Functional key words:

  • $currentDate, $inc, $max, $min, $mul, $rename, $setOnInsert, $set, $unset
  • $, $addToSet, $pop, $pullAll, $pull, $pushAll, $push
  • $each, $position, $slice, $sort

5 Remove

  • remove document
    db.COLLECTION.remove(condition)
    

    remove all documents when condition is not provided.

  • remove collection
    db.COLLECTION.drop()
    
  • remove database
    use DATABASE
    db.dropDatabase()
    
  • remove attribute
    db.COLLECTION.update(find, {$unset: {name: value}})
    

    value can be 1, "", true/false.

  • remove element in array
    db.COLLECTION.update({"name1.name2":value}, {$unset:{'name1.$.name2':1}})
    

6 Commands

show dbs
use DB
show collections
help
db.help()
db.COLLECTION.help()

7 Query Cursor Methods

  • count() // count of documents returned in a cursor
  • explain() // report
  • hint() // Forces MongoDB to use a specific index for a query
  • limit() // constrains the size of the returned results
  • skip() // skip the first number of documents
  • sort()
  • toArray()

8 MapReduce

db.COLLECTION.mapReduce(mapFunction,
                        reduceFunction,
                        {
                            query: "query performed at the beginning",
                            out: "out collection name",
                            finalize: finalizeFunction
                        })

Query -> map -> reduce -> finalize -> out.

  • The reduce function must return an object of the same type as the output of the map function.
  • The order of the emitted elements should not affect the output of the reduce function.
  • The reduce function must be idempotent, which means f(f(x))=f(x).
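The three requirements on the reduce function can be checked on a tiny in-memory model. This is a plain Scala sketch of the map -> reduce flow, not MongoDB code; Order, mapFn, reduceFn and mapReduce are hypothetical names.

```scala
object MapReduceSketch {
  case class Order(customer: String, amount: Int)

  // map emits a (key, value) pair; reduce folds all the values of one key
  def mapFn(o: Order): (String, Int) = (o.customer, o.amount)
  def reduceFn(key: String, values: List[Int]): Int = values.sum

  def mapReduce(orders: List[Order]): Map[String, Int] =
    orders.map(mapFn).groupBy(_._1).map { case (k, kvs) => k -> reduceFn(k, kvs.map(_._2)) }

  val orders: List[Order] = List(Order("ann", 5), Order("bob", 7), Order("ann", 10))
  val totals: Map[String, Int] = mapReduce(orders)

  // Same type in and out (Int), insensitive to order, stable when re-reduced
  val values: List[Int] = List(5, 10)
  val orderIndependent: Boolean =
    reduceFn("ann", values) == reduceFn("ann", values.reverse)
  val idempotent: Boolean =
    reduceFn("ann", List(reduceFn("ann", values))) == reduceFn("ann", values)
}
```

The idempotence check feeds the output of reduce back in as its only input, which is exactly what may happen when MongoDB reduces the values of one key in several rounds. A reduce that computed an average directly would fail this check.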


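The tool descriptions mainly introduce frequently used libraries and frameworks, and the platforms used in front-end and back-end development.

1 Back end

Node.js is a JavaScript platform. It provides a JavaScript runtime environment outside the browser, together with a set of runtime libraries and various packages. npm (nodejs package manager) is the package management system of this platform. It installs the dependencies required by a project by reading the package.json file (just like Maven's pom.xml).

bower is a component within nodejs that is dedicated to managing the packages needed for front-end development (JavaScript, CSS). It differs from npm in what it manages, and in that bower uses flat dependencies while npm uses tree-shaped dependencies. bower reads bower.json to install dependencies.

yeoman, called yo at installation, is a tool for creating project skeletons. It creates the file structure according to a yeoman-generator, i.e. a template. Templates need to be installed with npm, for example

npm install -g generator-gulp-angular

Template names start with generator. To use it, run

mkdir [app-name] && cd $_
yo gulp-angular [app-name]

to generate a project skeleton based on the generator-gulp-angular template, with the project name [app-name]. Taking the template just used as an example, during creation it asks which version of angular you need, which CSS style library (bootstrap, material), which angular implementation (angular-strap), and so on.

grunt / gulp are used to run tasks such as packaging, compression and publishing. In the example above, we used a gulp-based template, so gulp is used to run these tasks. After the template is generated, a directory named gulp and a file called gulpfile.js have been created in the project directory. The gulp directory contains a series of js files. gulpfile.js imports the gulp library, defines the paths of some directories (src) and imports the gulp directory, so that the js files under the gulp directory are used by gulp to perform the corresponding operations. The GitHub description of this template also lists a series of optional features, but for now I cannot tell where they are defined.

2 Front end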

  • JavaScript
    • AngularJS
    • UI-Router
  • UI Framework
    • Bootstrap
    • Angular Material
  • CSS Preprocessor
    • Sass(Node.js)
    • Less
  • JS Preprocessor
    • ES6
    • TypeScript
    • CoffeeScript
    • AtScript
  • HTML template
    • .jade
    • .haml