---
layout: home
---

    <script type="application/ld+json">
    {
      "@context":"https://schema.org/",
      "alternateName": ["openloris"],
      "@type":"Dataset",
      "name":"OpenLORIS-Object Dataset",
      "description":"The recent breakthroughs in computer vision have benefited from the availability of large representative datasets (e.g. ImageNet and COCO) for training. Yet, robotic vision poses unique challenges for applying visual algorithms developed from these standard computer vision datasets due to their implicit assumption over non-varying distributions for a fixed set of tasks. Fully retraining models each time a new task becomes available is infeasible due to computational, storage and sometimes privacy issues, while naive incremental strategies have been shown to suffer from catastrophic forgetting. It is crucial for robots to operate continuously under open-set and detrimental conditions with adaptive visual perceptual systems, where lifelong learning is a fundamental capability. However, very few datasets and benchmarks are available to evaluate and compare emerging techniques. To fill this gap, we provide a new lifelong robotic vision dataset (“OpenLORIS-Object”) collected via RGB-D cameras. The dataset embeds the challenges faced by a robot in real-life applications and provides new benchmarks for validating lifelong object recognition algorithms. This dataset could support object classification, detection and segmentation. The 1st version of OpenLORIS-Object is a collection of 121 instances, including 40 categories of daily necessities under 20 scenes. For each instance, a 17 to 25 second video (at 30 fps) has been recorded with a depth camera, delivering around 500 to 750 frames (260 to 600 distinguishable object views are manually picked and provided in the dataset). Four environmental factors, each with three difficulty levels, are considered explicitly: illumination variation during recording, occlusion percentage of the objects, object pixel size in each frame, and the clutter of the scene. Note that the variables of 3) object size and 4) camera-object distance are combined because in real-world scenarios it is hard to distinguish the effects of these two factors on the data collected from mobile robots, but their joint effect on the pixel sizes of the objects in the frames can be roughly identified. The variable 5) is considered as different recorded views of the objects. The three difficulty levels defined for each factor are shown in Table II (12 levels in total w.r.t. the environment factors across all instances). Levels 1, 2, and 3 are ranked by increasing difficulty.",
      "url":"https://lifelong-robotic-vision.github.io",
      "sameAs":"https://lifelong-robotic-vision.github.io",
      "keywords":["continual learning","lifelong learning","robotic"],
      "license" : "https://freedomdefined.org/Licenses/CC-BY-4.0",
      "creator": [
      {
        "@type": "Person",
        "sameAs": "http://orcid.org/0000-0000-0000-0000",
        "givenName": "Qi",
        "familyName": "She",
        "name": "Qi She"
      },
      {
        "@type": "Organization",
        "sameAs": "https://ror.org/03q8dnn23",
        "name": "City University of Hong Kong"
      },
      {
        "@type": "Organization",
        "sameAs": "https://ror.org/03cve4549",
        "name": "Tsinghua University"
      }],
      "distribution": {
        "@type": "DataDownload",
        "contentUrl": "https://docs.google.com/document/d/1KlgjTIsMD5QRjmJhLxK4tSHIr0wo9U6XI5PuF8JDJCo/edit#"
      },
      "includedInDataCatalog":{
         "@type":"DataCatalog",
         "name":"OpenLORIS Group"
      }
    }
    </script>
<!---
<div class="user-details">
<h1> Lifelong Robotic Vision Challenge </h1>
<p style="text-align: justify;"> &nbsp;&nbsp;&nbsp;&nbsp; Benchmarking Scene Understanding, including Lifelong Object Recognition and SLAM, and Lifelong or Continual Learning for Robotic Vision. Powered by the Robot Innovation Lab, <b>Intel Labs China</b>, Department of Electronic Engineering, <b>Tsinghua University</b>, and Department of Electronic Engineering, <b>City University of Hong Kong</b>. You can learn more about our scope and motivation for this challenge on the <a href="/about">about page</a>. To join our IROS 2019 competition (either the Lifelong Object Recognition or the Lifelong SLAM task), please contact us at: <a href="mailto:qi.she@intel.com">qi.she@intel.com</a> or <a href="mailto:xuesong.shi@intel.com">xuesong.shi@intel.com</a>.</p>

<div class="analytics"  style="border: solid lightgrey; border-radius: 5px;">
	<h3> Analytics </h3>
	{% include clastrmap.html %}
	<p> <small> If you are not seeing a map, please disable Ad block </small></p>
</div>
</div>


<div class="permlinks">

<h1>Competition Details</h1>
<dl>
	{% for post in site.posts limit:4 %}
	<dt><code>{{ post.date | date_to_string }} </code><i class="fas fa-angle-double-right" aria-hidden="true"></i><a href="{{ post.url }}">{{ post.title }}</a> &nbsp;{% include status-indicator.html status=post.status%}
	{% if post.description %}
 <dd style="text-align: justify">{{ post.description | markdownify }}
    </dd>
	{% endif %}
	{% endfor %}
	<p>... <a href="/blog">Full Posts List</a> </p>
</dl>

<h1>News</h1>
<ul>
	<li><b>Nov 2019</b>: We are happy to announce that IROS 2019 is hosting our competition. Participants of our Lifelong Robotic Vision Challenge (two tasks) will present their approaches and results, and we will announce the competition winners at IROS 2019.</li>
	<li><b>June 2019</b>: Our Lifelong Robotic Vision Dataset 1.0 (LRV1.0) will be released!</li>
</ul>

<h1>Organizers</h1>
<ul>
	<li> Dr. Qi She (<b>Intel Labs China</b>)
	<li> Dr. Xuesong Shi (<b>Intel Labs China</b>)
	<li> Dr. Yimin Zhang (<b>Intel Labs China</b>)
	<li> Prof. Fei Qiao (<b>Tsinghua University</b>)
	<li> Prof. Rosa Chan (<b>City University of Hong Kong</b>)
</ul>
</div >
--->

<!---
# Lifelong Learning Definition (Opinions are my own)

The first and most crucial thing to consider seriously is what **"lifelong learning"** means in the robotic vision area. Below we have summarized some scenarios that should be included under this definition. The robot continuously learns


- instances of known classes, improving the classifier based on the accumulated instances; this is an enhancement process.
- novel classes, which have not appeared in the previous learning procedure. The model should be able to extend its class-incremental capability.
- multiple tasks which are highly relevant, such as those under I.I.D. assumptions; we can say the tasks are within the same environment.
- multiple tasks which come from non-I.I.D. situations.--->


<div class="user-details">
<h1>Lifelong Robotic Vision</h1>

<img src="https://lifelong-robotic-vision.github.io/about/Relation.png" alt="Human-Robot-Computer" width="520">

<p style="text-align: justify;">Humans have the remarkable ability to <strong><I>learn continuously</I></strong> from the external environment and from inner experience. One of the grand goals of robotics is likewise to build an artificial <strong><I><q>lifelong learning</q></I></strong> agent that can shape a cultivated understanding of the world from the current scene and its previous knowledge via <strong><I>autonomous lifelong development.</I></strong></p>

<p style="text-align: justify;">Recent advances in computer vision and deep learning have been impressive, driven by large-scale datasets such as ImageNet and COCO. The breakthroughs in object/person recognition, detection, and segmentation have relied heavily on the availability of these large representative datasets for training. <strong><I>However, robotic vision poses new challenges for applying visual algorithms developed from computer vision datasets due to their implicit assumption over non-varying distributions for a fixed set of categories and tasks.</I></strong> In a real environment, semantic concepts change dynamically over time. Specifically, in real scenarios the robot operates continuously under open-set and sometimes detrimental conditions, which requires a lifelong learning capability with reliable uncertainty estimates and robust algorithm designs.</p>

<blockquote>
<h3><font face="sans-serif">Providing a robotic vision dataset collected from the real time-varying environments can accelerate both research and applications of visual models for robotics!</font></h3>
</blockquote>

<h1>Dataset and Competition</h1>

<img src="https://lifelong-robotic-vision.github.io/about/Night.gif" alt="Human-Robot-Computer" width="360" height="240">

<p style="text-align: justify;">We will utilize the unique characteristics of robotics to enhance robotic vision research by using additional high-resolution sensors (e.g. depth and point clouds), controlling the camera directions &amp; numbers, and even reducing the intense labeling effort with self-supervision. To accelerate lifelong robotic vision research, we will provide <strong><I>robot sensor data (RGB-D, IMU, etc.) in several kinds of typical scenarios, like homes and offices, with multiple objects, persons, scenes, and ground-truth trajectory acquired from auxiliary measurements with high-resolution sensors.</I></strong> Not only are the sensor information, scenarios, and task types highly diverse, but our dataset also embraces slow and fast dynamics in real life, which to our knowledge makes it <a href="{{site.url}}{{site.baseurl}}/dataset">the first real-world dataset under the <strong><I>lifelong robotic vision</I></strong> setting</a>.</p>

<p style="text-align: justify;">The major challenge for lifelong robotic vision is the continuous understanding of a dynamic environment. At the object level, the robot should be able to learn new object models incrementally without forgetting previous objects. At the scene level, the robot should be able to incrementally update its world model without getting lost. Thus, we start from the particular research topics of lifelong object recognition and lifelong SLAM, provide benchmarks for both tasks, and organize competitions to accelerate related research. <a href="{{site.url}}{{site.baseurl}}/competition">The first competition will be held online from July to October 2019, with a workshop-like event at IROS 2019 in Macau on November 4</a>.</p>

<h1>Vision and Expectation</h1>

<p style="text-align: justify;">Research outcomes. Research challenges or competitions should improve the state of the art by providing rich training/testing data and context information. Moreover, realistic environments would inspire the development of more practical and scalable learning methods. Our collected dataset should be able to provide potential modifications to the existing robotic vision contests that we believe will encourage these directions.</p>
<p style="text-align: justify;">Improving participation. The purpose of our dataset and challenge is to provide both an opportunity to exchange ideas and a venue to evaluate and encourage state-of-the-art research. The Lifelong Robotic Vision challenge aims to encourage the participation of machine learning, robotics and computer vision researchers. Below we discuss practical suggestions to increase researcher participation.</p>

<!--# Posts

The posts are at different status.

| Status    | Meaning                                                      |
| --------- | ------------------------------------------------------------ |
| Completed | This post is considered completed, but I might edit it when I came up with something new. |
| Writing   | This post is being actively edited.                          |
| Paused    | This post is considered of low priority. I will come back to this post later. |
| Archived  | This post is outdated and I probably won't update it anymore. |

# Sources

This website (source code [here](https://github.com/yk-liu/yk-liu.github.io)) uses these sources:

| Module                                                       | Mainly used in                                  | License/ TOS                                                 |
| ------------------------------------------------------------ | ----------------------------------------------- | ------------------------------------------------------------ |
| [Particle.js](https://github.com/VincentGarreau/particles.js) | Homepage                                        | [MIT](http://opensource.org/licenses/MIT)                    |
| [Visitor map](https://clustrmaps.com/)                       | Homepage, footer                                | [TOS](https://clustrmaps.com/legal)                          |
| [Homepage and color scheme](https://github.com/nrandecker/particle) | Layout @ homepage, color scheme @ all pages     | [MIT](http://opensource.org/licenses/MIT)                    |
| [List of recent post](https://github.com/mdo/jekyll-snippets/blob/master/posts-list.html) | Homepage, Post index                            | [MIT](http://opensource.org/licenses/MIT)                    |
| [Search](https://github.com/christian-fei/Simple-Jekyll-Search) | Post index, Tags index                          | [MIT](http://opensource.org/licenses/MIT)                    |
| [Side bar](https://github.com/poole/lanyon)                  | Post, all pages with these elements             | [MIT](https://github.com/poole/lanyon/blob/master/LICENSE.md) |
| [Table of content](https://github.com/allejo/jekyll-toc)     | Post                                            | [BSD-3](https://opensource.org/licenses/BSD-3-Clause) or [MIT](http://opensource.org/licenses/MIT) |
| [Markdown vue theme and color scheme](https://github.com/blinkfox/typora-vue-theme) | Markdown theme @ Post, color scheme @ all pages | [Apache-2.0](http://www.apache.org/licenses/LICENSE-2.0)     |
| [Tags, Tag cloud, Tag page](https://hyunyoung2.github.io/2016/12/17/Tag_Cloud/) | Post, Post index, Tags index                    | [MIT](http://opensource.org/licenses/MIT), repo [here](https://github.com/hyunyoung2/hyunyoung2.github.io). Tag page inspired by [haixing-hu](https://haixing-hu.github.io/tags.html) |
| [Font size adjustment](https://codepen.io/robgolbeck/pen/yePRwa) | Post                                            | [MIT](http://opensource.org/licenses/MIT)                    |
| [comment](https://commentit.io)                              | Post                                            | [AGPL-3.0](https://www.gnu.org/licenses/agpl-3.0.html)       |
| [404 T-rex game](https://github.com/wayou/t-rex-runner)      | 404 page                                        | from [Chromium source code](https://cs.chromium.org/chromium/src/components/neterror/resources/offline.js?q=t-rex+package), [license](https://chromium.googlesource.com/chromium/src.git/+/master/LICENSE) |
| [Encryption](https://github.com/robinmoisson/staticrypt)     | Secret Pages                                    | [MIT](http://opensource.org/licenses/MIT)                    |

Additional licensing information can be found [here](https://github.com/yk-liu/yk-liu.github.io/blob/master/LICENSE.md).

I mainly use [Typora](https://www.typora.io) to write my post.-->


<div class="row">
  <div class="col-xs-12">
    <h1>Organizers</h1>
  </div>
</div>

<img src="https://lifelong-robotic-vision.github.io/about/organizer.png" alt="Organizers" width="820">


<!--
<div class="row">
  <div class="col-xs-2">
    <a>
      <img class="people-pic" src="https://lifelong-robotic-vision.github.io/about/fig_Roger.jpg" />
    </a>
    <div class="people-name">
      <h4>Dr. Qi She</h4>
      <h4>Intel Labs China</h4>
    </div>
  </div>

  <div class="col-xs-2">
    <a>
      <img class="people-pic" src="https://lifelong-robotic-vision.github.io/about/fig_Xuesong.jpg" />
    </a>
    <div class="people-name">
      <h4>Dr. Xuesong Shi</h4>
      <h4>Intel Labs China</h4>
    </div>
  </div>

  <div class="col-xs-2">
    <a>
      <img class="people-pic" src="https://lifelong-robotic-vision.github.io/about/fig_Yimin.jpg" />
    </a>
    <div class="people-name">
      <h4>Dr. Yimin Zhang</h4>
      <h4>Intel Labs China</h4>
    </div>
  </div>

  <div class="col-xs-2">
    <a>
      <img class="people-pic" src="https://lifelong-robotic-vision.github.io/about/fig_Qiaofei.jpg" />
    </a>
    <div class="people-name">
      <h4>Prof. Fei Qiao</h4>
      <h4>Tsinghua University</h4>
    </div>
  </div>

  <div class="col-xs-2">
    <a>
      <img class="people-pic" src="https://lifelong-robotic-vision.github.io/about/fig_Rosa.jpg" />
    </a>
    <div class="people-name">
      <h4>Prof. Rosa H.M. Chan</h4>
      <h4>City University of Hong Kong</h4>
    </div>
  </div>
</div>
-->


<!--
<ul>
<li><p style="text-align: justify;">Dr. Qi She (<b><font face="sans-serif">Intel Labs China</font></b>)</p></li>
<li><p style="text-align: justify;">Dr. Xuesong Shi (<b><font face="sans-serif">Intel Labs China</font></b>)</p></li>
<li><p style="text-align: justify;">Dr. Yimin Zhang (<b><font face="sans-serif">Intel Labs China</font></b>)</p></li>
<li><p style="text-align: justify;">Prof. Fei Qiao (<b><font face="sans-serif">Tsinghua University</font></b>)</p></li>
<li><p style="text-align: justify;">Prof. Rosa Chan (<b><font face="sans-serif">City University of Hong Kong</font></b>)</p></li>
</ul>
-->

<h1>Acknowledgement</h1>
<p style="text-align: justify;">Thanks to the following collaborators for contributing to the dataset: Chunhao Zhu, Dongjiang Li, Dion Gavin Mascarenhas, Feng (Eric) Fan, Ke Ou, Kwunyu Wu, Qinbin Tian, Qihan (Jack) Yang, Qiwei Long, Rong Hong, Yuxin Tian, Yiming Hu, and Zhigang Wang.</p>

<p style="text-align: justify;">Thanks to Yingkai Liu for the great <a href="https://yk-liu.github.io/">website template</a>.</p>

<!---<h1>Sponsor</h1>--->

<h1>Further Reading</h1>

<ul>
<li><p style="text-align: justify;"><a href="https://mp.weixin.qq.com/s/_txt3Y9HJlNDFljDCjKODA">Incremental Learning Makes the Robot Smarter (in Chinese)</a></p></li>
<li><p style="text-align: justify;"><a href="https://mp.weixin.qq.com/s/9d0sbFdeAzgu81rzwDii9A">Towards Spatial AI: Building a World Model (in Chinese)</a></p></li>

</ul>

<h1>Join the Competition</h1>
<p style="text-align: justify;">To join our IROS 2019 competition (either the Lifelong Object Recognition or the Lifelong SLAM task), please contact us at: <a href="mailto:qi.she@intel.com">qi.she@intel.com</a> or <a href="mailto:xuesong.shi@intel.com">xuesong.shi@intel.com</a>.</p>

<div class="analytics"  style="border: solid lightgrey; border-radius: 5px;">
	<h3> Analytics </h3>
	{% include clastrmap.html %}
	<p> <small> If you do not see a map, please disable your ad blocker. </small></p>
</div>
</div >