Yes, the functions you're looking for are aptly named wait
and progress
.
from dask.distributed
import wait, progress
The progress
function takes any dask thing and renders a progress bar
>>> progress(x)[XXXXXXX................] 5.2 seconds
If you do not want a progress bar, or if you are in the Jupyter notebook, then you may want to separately use the wait
function, which will block until the computations finish.
wait(x)
Wait until computation completes, gather result to local process.,You can submit individual tasks using the submit method:,Use submit method to send individual computations to the cluster,Sometimes situations arise where tasks, workers, or clients need to coordinate with each other in ways beyond normal task scheduling with futures. In these cases Dask provides additional primitives to help in complex situations.
from dask.distributed import Client client = Client() # start local workers as processes # or client = Client(processes = False) # start local workers as threads
def inc(x):
return x + 1
def add(x, y):
return x + y
a = client.submit(inc, 10) # calls inc(10) in background thread or process
b = client.submit(inc, 20) # calls inc(20) in background thread or process
>>> a
<Future: status: pending, key: inc-b8aaf26b99466a7a1980efa1ade6701d>
>>> a
<Future: status: finished, type: int, key: inc-b8aaf26b99466a7a1980efa1ade6701d>
>>> a.result() # blocks until task completes and data arrives
11
c = client.submit(add, a, b) # calls add on the results of a and b
futures = client.map(inc, range(1000))
You can turn any dask collection into a concrete value by calling the .compute() method or dask.compute(...) function. This function will block until the computation is finished, going straight from a lazy dask collection to a concrete value in local memory.,The collection is returned immediately and the computation happens in the background on the cluster. Eventually all of the futures of this collection will be completed at which point further queries on this collection will likely be very fast.,Concrete values in local memory. Example include the integer 1 or a numpy array in the local process.,Because this is a single future the result must fit on a single worker machine. Like dask.compute above, the client.compute method is only appropriate when results are small and should fit in memory. The following would likely fail:
>>> df = dd.read_csv('s3://...') >>>
df.value.sum().compute()
100000000
>>> df.compute() MemoryError(...)
>>> df = dd.read_csv('s3://...') >>> total = client.compute(df.sum()) # Return a single future >>> total Future(..., status = 'pending') >>> total.result() # Block until finished 100000000
>>> future = client.compute(df) # Blows up memory
>>> df = dd.read_csv('s3://...') >>> df.dask # Recipe to compute df in chunks { ('read', 0): (load_s3_bytes, ...), ('parse', 0): (pd.read_csv, ('read', 0)), ('read', 1): (load_s3_bytes, ...), ('parse', 1): (pd.read_csv, ('read', 1)), ... } >>> df = client.persist(df) # Start computation >>> df.dask # Now points to running futures { ('parse', 0): Future(..., status = 'finished'), ('parse', 1): Future(..., status = 'pending'), ... }
futures = client.scatter(args) # Send data future = client.submit(function, * args, ** kwargs) # Send single task futures = client.map(function, sequence, ** kwargs) # Send many tasks
© Copyright 2017-2022 H2O.ai. Last updated on Jun 14, 2022.
# Make a config directory mkdir config # Copy the config.toml file to the new config directory. docker run--runtime = nvidia\ --pid = host\ --rm\ --init\ - u `id -u`: `id -g`\ - v `pwd` / config: /config \ --entrypoint bash\ h2oai / dai - ubi8 - x86_64: 1.10 .3 .1 - cuda11 .2 .2.xx - c "cp /etc/dai/config.toml /config"
docker run--runtime = nvidia\
--pid = host\
--init\
--rm\
--shm - size = 256 m\ -
u `id -u`: `id -g`\ -
p 12345: 12345\ -
e DRIVERLESS_AI_CONFIG_FILE = "/config/config.toml"\ -
v `pwd` / config: /config \ -
v `pwd` / data: /data \ -
v `pwd` / log: /log \ -
v `pwd` / license: /license \ -
v `pwd` / tmp: /tmp \
h2oai / dai - ubi8 - x86_64: 1.10 .3 .1 - cuda11 .2 .2.xx
export DRIVERLESS_AI_CONFIG_FILE = “/config/config.toml”
2489 2490 2491 2492 2493 2494 2495 2496 2497 2498 2499 2500 2501 2502 2503 2504 2505 2506 2507 2508 2509 2510 2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 2530 2531 2532 2533 2534 2535 2536 2537 2538 2539 2540 2541 2542 2543 2544 2545 2546 2547 2548 2549 2550 2551 2552 2553 2554 2555 2556 2557 2558 2559 2560 2561 2562 2563 2564 2565 2566 2567 2568 2569 2570 2571 2572 2573 2574 2575 2576 2577 2578 2579 2580 2581 2582 2583 2584 2585 2586 2587 2588 2589 2590 2591 2592 2593 2594 2595 2596 2597 2598 2599 2600 2601 2602 2603 2604 2605 2606 2607 2608 2609 2610 2611 2612 2613 2614 2615 2616 2617 2618 2619 2620 2621 2622 2623 2624 2625 2626 2627 2628 2629 2630