11/19/2021
What is Geospatial Data?
Any Data with a Geospatial Element is Geospatial Data
It can be a point:
It can be a line:
It can be a polygon:
| Geospatial Element | Sub-element | Example |
|---|---|---|
| Point | xy coordinates | Lina Building Point on a Map |
| Line | 2 Points (2 xy coordinates) | Average Speed of Road Segment on Minfu Road |
| Polygon | Lines (3 xy coordinates or more) | Population within Shangpu Center Boundary |
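The three elements in the table can be sketched in plain Python as coordinate pairs. The coordinates and the helper name `point_in_polygon` below are made up for illustration; the helper uses the standard ray-casting test to answer a "how many points fall within this boundary" question, similar to the polygon example above.

```python
# Hypothetical (x, y) coordinates, for illustration only.
point = (2.0, 1.0)
line = [(0.0, 0.0), (4.0, 0.0)]                            # 2 points
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]  # 3 or more points

def point_in_polygon(p, poly):
    """Ray casting: count edges crossed by a horizontal ray going right from p."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Count how many (made-up) people locations fall inside the boundary.
people = [(1.0, 1.0), (5.0, 1.0), (3.0, 2.0)]
count_inside = sum(point_in_polygon(p, polygon) for p in people)
```

In practice a geospatial library would handle edge cases (points exactly on the boundary, holes in polygons) that this sketch ignores.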
A coordinate system works much like units for coordinates. For example, the metric system is used widely around the world, but not in the US.
The WGS84 coordinate system is widely used in the US and globally…
But not in China.
Here in China, we use GCJ02 (国测局2002, after the national surveying and mapping bureau and the year 2002, if you are wondering), which we call the “Mars Coordinate System” (火星坐标系).
Or BD09, the Baidu coordinate system.
When doing geospatial analysis, the first thing to do is to choose the right coordinate system, otherwise…
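To give a feel for how much the choice of coordinate system matters, here is a minimal sketch of the GCJ02 to BD09 offset. The formulas and constants are assumptions taken from widely circulated community implementations, not from an official specification, and the function names and the sample coordinates (roughly the Sanlitun area) are hypothetical.

```python
import math

# Constant used by community implementations of the GCJ02 <-> BD09 offset
# (an assumption here, not an officially published value).
X_PI = math.pi * 3000.0 / 180.0

def gcj02_to_bd09(lon, lat):
    """Approximate conversion from GCJ02 to BD09 (degrees in, degrees out)."""
    z = math.hypot(lon, lat) + 0.00002 * math.sin(lat * X_PI)
    theta = math.atan2(lat, lon) + 0.000003 * math.cos(lon * X_PI)
    return z * math.cos(theta) + 0.0065, z * math.sin(theta) + 0.006

def bd09_to_gcj02(bd_lon, bd_lat):
    """Approximate inverse of gcj02_to_bd09."""
    x, y = bd_lon - 0.0065, bd_lat - 0.006
    z = math.hypot(x, y) - 0.00002 * math.sin(y * X_PI)
    theta = math.atan2(y, x) - 0.000003 * math.cos(x * X_PI)
    return z * math.cos(theta), z * math.sin(theta)

# Illustrative GCJ02 coordinates, not a surveyed location.
gcj_lon, gcj_lat = 116.455, 39.937
bd_lon, bd_lat = gcj02_to_bd09(gcj_lon, gcj_lat)
back_lon, back_lat = bd09_to_gcj02(bd_lon, bd_lat)
```

Under these assumptions the two systems differ by about 0.006 degrees in each direction, which is several hundred meters on the ground: enough to place a point on the wrong block if the systems are mixed up.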
Geocoding: the process of turning an address into coordinates.
A Location-based Service (LBS) is a software service that uses geographic data and information to provide services or information to users.
Navigation: Baidu Map. The app needs to know where you are to provide navigation.
Social Networking: WeChat, Nearby Friend, etc.
Location-based Advertising: Dianping pushes local restaurants to you when you are in an area.
Tracking: Almost every app on your phone.
LBS data is a by-product generated by LBS providers, who collect and process millions of users’ locations from the location-based services they provide. LBS data contains information about the users in an area:
Flow and Population
Gender, Age, Education, Purchase Power, Housing Price, Favorite Sport, Favorite Apps….
Customer Origin
And Many More
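As a rough sketch of how raw location pings could be processed into an area-level "flow" figure, the example below bins pings into grid cells and counts distinct users per cell. The ping records, the `(user_id, lon, lat)` layout, and the cell size are all hypothetical; real LBS providers use their own pipelines.

```python
import math
from collections import defaultdict

# Hypothetical pings: (user_id, lon, lat). Not real data.
pings = [
    ("u1", 116.4551, 39.9371),
    ("u1", 116.4553, 39.9372),
    ("u2", 116.4556, 39.9374),
    ("u3", 116.4612, 39.9405),
]

CELL_SIZE_DEG = 0.001  # assumed grid size, roughly 100 m in latitude

def cell_of(lon, lat, size=CELL_SIZE_DEG):
    """Map a coordinate to an integer grid cell index."""
    return (math.floor(lon / size), math.floor(lat / size))

def unique_users_per_cell(records):
    """Count distinct users seen in each grid cell."""
    users = defaultdict(set)
    for user_id, lon, lat in records:
        users[cell_of(lon, lat)].add(user_id)
    return {cell: len(ids) for cell, ids in users.items()}

flow = unique_users_per_cell(pings)
```

Counting distinct users rather than pings keeps one very active phone from inflating the flow of a cell.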
A new Jordan flagship store is about to open in the Sanlitun area of Chaoyang District, Beijing. The purpose of this study is to help teams better understand the current landscape of Jordan-related purchases in the Beijing urban area, to illustrate the customer flow within the Sanlitun commercial area, and to model the Jordan purchasing behavior of Beijing Live customers.
The study focuses on four main aspects of the Jordan store:
Exploratory Data Analysis on orders and members within Beijing Urban area (study area).
Exploratory Data Analysis on orders and members of Beijing Live store. This store is also located in Sanlitun.
A binary model of the probability that a Beijing Live customer buys a Jordan product.
A geospatial analysis on the Location-based service data for Sanlitun commercial area.
Algorithm used: Logistic Regression Classification
\(\text{logit}(P(Y = 1)) = \log\left(\frac{p}{1-p}\right) = \beta_{0} + \beta_{1}\,\text{Gender} + \beta_{2}\,\text{AgeGroup} + \beta_{3}\,\text{Basketball} + \beta_{4}\,\text{Football} + \beta_{5}\,\text{live} + \dots + \beta_{n}\,\text{HighValue}\), where \(Y = 1\) means the customer buys Jordan and \(p = P(Y = 1)\).
Logistic regression is commonly used for binary classification problems. One advantage of this algorithm is that it gives a coefficient \(\beta\) for each term, and the sign of each coefficient shows the direction of that term's effect on the dependent variable.
| Variable | Log Odds(\(\beta\)) |
|---|---|
| Male | 1.053 |
| Basketball | 1.305 |
| Live_basketball | 0.260 |
| Live_football | -0.243 |
| High_value_member | 1.501 |
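Log odds are easier to read once exponentiated into odds ratios. The sketch below uses only the coefficients from the table above; the helper `probability_from_logit` is a hypothetical name for the standard logistic function, and no intercept is assumed because the table does not report one.

```python
import math

# Coefficients (log odds) copied from the table above.
log_odds = {
    "Male": 1.053,
    "Basketball": 1.305,
    "Live_basketball": 0.260,
    "Live_football": -0.243,
    "High_value_member": 1.501,
}

# exp(beta) is the odds ratio: the factor by which the odds of buying Jordan
# change when the variable goes from 0 to 1, holding other terms fixed.
odds_ratios = {name: math.exp(beta) for name, beta in log_odds.items()}

def probability_from_logit(logit):
    """Standard logistic function: turns a log-odds value into a probability."""
    return 1.0 / (1.0 + math.exp(-logit))
```

An odds ratio above 1 (for example `Male`, at roughly 2.9) raises the odds of a Jordan purchase, while one below 1 (`Live_football`) lowers them.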
| Age Group | Ratio | Percent of Shanghai Average |
|---|---|---|
| 18-24 | 35.7% | 183% |
| 25-30 | 29.6% | 143% |
| 31-35 | 14.6% | 81% |
What is Machine Learning?
Machine learning is the task of making computers more intelligent…
So that they can help us identify patterns (Unsupervised Learning) or make predictions based on previous patterns (Supervised Learning).
Your computer is like a baby. It reads about a thing with a head, 2 arms, and 2 legs that walks upright, and another thing with a head but 4 legs. It notices that major difference and tells you they are two different things. This is Unsupervised Machine Learning.
Now, your computer has no knowledge of what a human looks like, so you tell it: “a thing with a head, 2 arms, and 2 legs that walks upright is a human”. The next time it reads about a thing with a head, 2 arms, and 2 legs, it will think that description matches a human. This is Supervised Machine Learning.
Now, your computer has a camera installed, which means your baby now has eyes. When it “sees” a thing with a head, 2 arms, and 2 legs that walks upright for the first time, you tell it: “that thing you are seeing is a human”. The next time your computer sees something similar, it will recognize it as a human automatically. This is Computer Vision.
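The baby analogy can be sketched with a toy dataset where each "thing" is described by (arms, legs). This is only an illustration under made-up data: a tiny 2-means clustering stands in for unsupervised learning and a nearest-neighbor lookup stands in for supervised learning. The function names are hypothetical.

```python
def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_means(points, iterations=10):
    """Unsupervised: split points into two groups without any labels."""
    # Deterministic start: the first point and the point farthest from it.
    centers = [points[0], max(points, key=lambda p: dist2(p, points[0]))]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            idx = 0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1
            groups[idx].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return [0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1 for p in points]

def nearest_label(x, labeled):
    """Supervised: reuse the label of the closest example we were told about."""
    return min(labeled, key=lambda item: dist2(x, item[0]))[1]

# Made-up observations: (arms, legs).
things = [(2, 2), (2, 2), (0, 4), (0, 4), (2, 2)]
clusters = two_means(things)  # groups found without being told what they are

labeled = [((2, 2), "human"), ((0, 4), "not human")]
prediction = nearest_label((2, 2), labeled)
```

The clustering only says "these are two different kinds of things"; it takes the labeled examples to attach the name "human" to one of them.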
Full Path Analysis 1.0 by ARO team in 4 doors
A trained computer identifies a human - Traffic Count
Full coverage of cameras tracks that human - Path, Zone Traffic, Heatmap
Record how long a human stays in a location (zone, shelf, etc.) - Dwell Time
Record how long a human interacts with athletes - Athletes Engagement
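The metrics above could be derived from tracked detections along these lines. The record layout `(person_id, zone, timestamp_seconds)` and the sample values are assumptions for illustration, not the actual Full Path Analysis data format, and the sketch treats each person's time in a zone as one continuous visit.

```python
# Hypothetical tracking output: (person_id, zone, timestamp in seconds).
detections = [
    ("p1", "shelf_A", 0),
    ("p1", "shelf_A", 12),
    ("p1", "shelf_B", 20),
    ("p1", "shelf_B", 25),
    ("p2", "shelf_A", 5),
    ("p2", "shelf_A", 8),
]

def traffic_count(records):
    """Traffic Count: number of distinct people detected."""
    return len({person for person, _, _ in records})

def dwell_times(records):
    """Dwell Time: last minus first timestamp per (person, zone)."""
    first_last = {}
    for person, zone, t in records:
        key = (person, zone)
        lo, hi = first_last.get(key, (t, t))
        first_last[key] = (min(lo, t), max(hi, t))
    return {key: hi - lo for key, (lo, hi) in first_last.items()}

visitors = traffic_count(detections)
dwell = dwell_times(detections)
```

A production version would also need to split repeat visits to the same zone, which this sketch does not attempt.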
With Computer Vision, we get the most direct and accurate metrics for testing, based on the individual behavior of customers.
Any change in in-store communication can be evaluated and tested using Individual Dwell Time.
Which one of the two signs is more effective?
There is no way to answer this with transaction data or any store-level data.
Avoid Bias: Don’t test “time”
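One way to compare two signs on Individual Dwell Time is a permutation test on the difference in mean dwell time. This is a sketch of a generic technique, not necessarily the method used in the tests referenced here; the dwell-time samples and the function name are made up.

```python
import random

def permutation_test(a, b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in means of a and b."""
    rng = random.Random(seed)
    mean = lambda xs: sum(xs) / len(xs)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the pooled values and re-split them into two pseudo-groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical individual dwell times in seconds for customers who saw each sign.
sign_a = [30, 42, 35, 50, 38, 45, 41, 36]
sign_b = [22, 25, 30, 20, 28, 24, 26, 23]

p_value = permutation_test(sign_a, sign_b)
```

A small p-value suggests the difference in dwell time between the two signs is unlikely to be due to chance alone, provided both signs were shown under comparable conditions.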
Today, we went through the entire consumer journey using data.
Geospatial and LBS data give you insights into where your consumers are coming from and what kind of consumers to expect in the trade zone where the store is located, so you can better prepare your store to fit your consumers (concept, merch, etc.).
Computer Vision gives you insights into consumer behavior in store, so you can better serve consumers with effective in-store communication, VMS, design, and shelf and fitting room management.
In the future, connected product features will work with Computer Vision, so we will know which product attracts consumers to dwell longer at a shelf.
In-store Signage A/B Testing on Global Testing Page: https://confluence.nike.com/pages/viewpage.action?spaceKey=GlobalNDSSA&title=Tests+We%27ve+Run
Jordan 2.0 Study: http://www.zhangyunhai.com/nike/research/jordan/
This Presentation: http://www.zhangyunhai.com/nike/research/pres1119/