|
commit 92fd0ce4f659d7b0680543e9e5b96a3c7737a5f3
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Fri Aug 31 11:16:53 2012 +0200
intel: properly test for HAS_LLC
missed slightly and in effect had no effect on the outcome of checking
whether the kernel/chipset supported LLC.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
It's the same situation as flink and we need to take the same precautions.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
|
|
If the kernel supports the test, we need to check the param.
Copy&pasta from the above checks that only look at the return value.
Interesting how much one can get such a simple interface wrong.
Issue created in
commit 151cdcfe685ee280a4344dfc40e6087d74a5590f
Author: Eugeni Dodonov <eugeni.dodonov@intel.com>
Date: Tue Jan 17 15:20:19 2012 -0200
intel: query for LLC support
Patch even claims to have fixed this in v2, but is actually unchanged
from v1.
Reported-by: Xiang, Haihao <haihao.xiang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
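
The difference between checking only the return value and also checking the returned parameter can be sketched as follows. This is a minimal illustration, not the libdrm code: fake_getparam() is a hypothetical stand-in for the kernel query, and the fallback for kernels that do not know the parameter is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel getparam query: returns 0 when the
 * kernel knows the parameter and stores the answer in *value, or -1 when
 * the kernel does not know the parameter at all. */
static int fake_getparam(bool kernel_knows_param, int hw_value, int *value)
{
    assert(value != NULL);
    if (!kernel_knows_param)
        return -1;
    *value = hw_value;
    return 0;
}

/* Buggy pattern: a successful query is mistaken for "feature present". */
static bool has_llc_buggy(bool kernel_knows_param, int hw_value)
{
    int value = 0;

    return fake_getparam(kernel_knows_param, hw_value, &value) == 0;
}

/* Fixed pattern: the query must succeed AND report a non-zero value. */
static bool has_llc_fixed(bool kernel_knows_param, int hw_value)
{
    int value = 0;
    int ret = fake_getparam(kernel_knows_param, hw_value, &value);

    return ret == 0 && value != 0;
}
```

With a kernel that supports the query but hardware without LLC, the buggy variant reports true while the fixed variant reports false.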
|
|
Otherwise pad appears uninitialized and valgrind grumbles.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
|
|
Otherwise we end up with X hitting a fail-loop as the embedded libGL
stack asserts whilst initialising.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
|
|
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
This adds interfaces for the X driver to use to create a
prime handle from a buffer, and create a bo from a handle.
v2: use Chris's suggested naming (well from at least for consistency)
v3: git commit --amend fail
v4: fix as per Chris's suggestions, group assignments, add get tiling
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Since there is no getparam for hardware context support, Mesa always
tries to obtain a context by calling drm_intel_gem_context_create and
NULL-checking the result. On an older kernel without context support,
this caused libdrm to print an unwanted message to stderr:
DRM_IOCTL_I915_GEM_CONTEXT_CREATE failed: Invalid argument
In fact, this caused every Piglit test to fail with a "warn" status due
to the unrecognized error message.
Change the message to use DBG() rather than fprintf(), so people can
still get the debug message, but it won't spam normally.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
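
A debug-only message helper in the spirit of DBG() might look like the sketch below. The names debug_enabled and dbg_printf() are hypothetical and chosen for illustration; the real DBG() macro in libdrm may be structured differently.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical global debug switch; off by default. */
static int debug_enabled = 0;

/* Prints to `out` only when debugging is enabled. Returns the number of
 * characters written, or 0 when the message was suppressed. */
static int dbg_printf(FILE *out, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(out != NULL && fmt != NULL);
    if (!debug_enabled)
        return 0;

    va_start(ap, fmt);
    n = vfprintf(out, fmt, ap);
    va_end(ap);
    return n;
}
```

With the switch off, a failed context creation on an old kernel produces no output, so test harnesses that scan stderr are not disturbed.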
|
|
Add relevant code to set up minimal state and call the appropriate
kernel IOCTLs.
This was missed in the previous cherry-picking for 2.3.36.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
|
|
To support this we extract the common execbuf2 functionality to be
called with, or without contexts.
The context'd execbuf does not support some of the dri1 stuff.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
int drm_intel_gem_bo_wait(drm_intel_bo *bo, int64_t timeout_ns)
This should bump the libdrm version. We're waiting for context support
so we can do both features in one bump.
v2: don't return remaining timeout amount
use get param and fallback for older kernels
v3: only doing getparam at init
prototypes now have a signed input value
v4: update comments
fall back to correct polling behavior with new userspace and old kernel
v5: since the drmIoctl patch was not well received, return appropriate
values in this function instead. As Daniel pointed out, the polling
case (timeout == 0) should also return -ETIME.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
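
The return values described in v5 can be modelled with a small pure function. This is a hypothetical model, not the libdrm implementation: busy_for_ns stands in for how long the GPU would keep the buffer busy, and treating a negative timeout as "wait without limit" is an assumption made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of the wait result.
 *   busy_for_ns: how much longer the buffer stays busy (<= 0 means idle)
 *   timeout_ns:  0 polls, > 0 waits that long, < 0 is assumed to mean
 *                "wait without limit"
 * Returns 0 when the buffer is (or becomes) idle, -ETIME otherwise. */
static int model_bo_wait(int64_t busy_for_ns, int64_t timeout_ns)
{
    if (busy_for_ns <= 0)
        return 0;           /* already idle */
    if (timeout_ns < 0)
        return 0;           /* unlimited wait always ends idle */
    if (busy_for_ns > timeout_ns)
        return -ETIME;      /* includes the polling case, timeout == 0 */
    return 0;
}
```

The point of the polling case is that a caller passing timeout == 0 can tell "still busy" (-ETIME) apart from "idle" (0) without blocking.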
|
|
This patch adds a new function,
drm_intel_bufmgr_gem_set_aub_annotations(), which can be used to
annotate the type and subtype of data stored in various sections of
each buffer. This data is used to populate type and subtype fields
when generating the .aub file, which improves the ability of later
debugging tools to analyze the contents of the .aub file.
If drm_intel_bufmgr_gem_set_aub_annotations() is not called, then we
fall back to the old set of annotations (annotate the portion of the
batchbuffer that is executed as AUB_TRACE_TYPE_BATCH, and everything
else as AUB_TRACE_TYPE_NOTYPE).
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
These are more cases where valgrind doesn't understand what gets read
or written by our ioctls.
|
|
This improves the performance of Mesa's GL_MAP_UNSYNCHRONIZED_BIT path
in GL_ARB_map_buffer_range. Improves Unigine Tropics performance at
1024x768 by 2.30482% +/- 0.0492146% (n=61)
v2: Fix comment grammar.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
drmIoctl returns -1 on error with errno set to the error value. Other
users of it in this file just check for != 0, and only use errno when
they need to send an error value on to the caller of the API.
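
The convention can be sketched with the plain ioctl() system call, which follows the same "-1 plus errno" rule. The wrapper name ioctl_neg_errno() is hypothetical; it only shows where errno is the right thing to read.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>

/* Issues an ioctl and converts the "-1 plus errno" convention into a
 * negative errno value for the caller. The return value of ioctl() itself
 * is only compared against 0; it never carries the error code. This
 * assumes a request that returns 0 on success, as the ones discussed
 * here do. */
static int ioctl_neg_errno(int fd, unsigned long request, void *arg)
{
    int ret = ioctl(fd, request, arg);

    if (ret != 0)
        return -errno;
    return 0;
}
```

Calling the wrapper on an invalid file descriptor yields -EBADF, whereas comparing the raw return value against an errno constant would never match.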
|
|
We've been hacking these constantly.
|
|
This will allow the driver to capture all of its execution state to a
file for later debugging. intel_gpu_dump is limited in that it only
captures batchbuffers, and Mesa's captures, while more complete, still
capture only a portion of the state involved in execution.
This is a squash commit of a long series of hacking as we tried to get
the resulting traces to work in the internal simulator. It contains
contributions by Yuanhan Liu and Kenneth Graunke.
v2: Drop the MI_FLUSH_ENABLE setup.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
For example:
export INTEL_DEVID_OVERRIDE=0x162
If this variable is set, don't actually submit the batchbuffer to the
GPU, since it probably contains commands for the wrong generation of hardware.
v2: Introduce a getter for the overridden devid, and avoid getenv per exec.
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
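
Reading the override once at setup time, rather than calling getenv() for every execbuf, could be sketched like this. The struct and function names are hypothetical and only mirror the behaviour described above; the 16-bit range check is an assumption based on PCI device IDs being 16 bits wide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical slice of buffer manager state. */
struct bufmgr_sketch {
    int devid_override; /* 0 means "no override" */
    bool no_exec;       /* skip real submission when overriding */
};

/* Parses a string such as "0x162". Returns 0 for NULL, empty, malformed
 * or out-of-range input. */
static int parse_devid_override(const char *s)
{
    char *end;
    long v;

    if (s == NULL || *s == '\0')
        return 0;
    v = strtol(s, &end, 0);  /* base 0 accepts the 0x prefix */
    if (*end != '\0' || v <= 0 || v > 0xffff)
        return 0;
    return (int)v;
}

/* Called once at init with the environment value, so that no getenv()
 * is needed on the execbuf path. */
static void bufmgr_sketch_init(struct bufmgr_sketch *b, const char *env)
{
    assert(b != NULL);
    b->devid_override = parse_devid_override(env);
    b->no_exec = b->devid_override != 0;
}
```

A caller would pass getenv("INTEL_DEVID_OVERRIDE") as the second argument during initialisation and consult no_exec before submitting a batch.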
|
|
Every access to either the GTT or CPU pointer is supposed to be
preceded by a set_domain ioctl so that GEM is able to manage the cache
domains correctly and for the following access to be coherent. Of
course, some people explicitly want incoherent, non-blocking access
which is going to trigger warnings by this patch but are probably better
served by explicit suppression.
v2: Also mark the pointers as inaccessible following the explicit unmap
and implicit unmap upon return to the cache.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
In particular, declare the hidden CPU mmaps to valgrind so that it knows
about those memory regions.
v2: Add an additional VG_CLEAR for the getparam
References: https://bugs.freedesktop.org/show_bug.cgi?id=35071
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
[anholt: Ideally valgrind should just learn about the ioctls, and
removing the clear for the non-valgrindified code feels risky.]
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
This adds support for querying the kernel about the LLC support in the
hardware.
In case the ioctl fails, we assume that it is present on GEN6 and GEN7.
v2: fix the return code checking
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
|
|
If the pci_device's actual gen was > 4, then we stupidly set
bufmgr_gem->gen = 6. Luckily this caused no bugs, and this fix shouldn't
change any behavior, because all checks against the gen currently have one
of the forms below:
gen == 2
gen == 3
gen >= 4
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
|
|
This will make these macros reusable from intel_decode.c, which
doesn't have a bufmgr_gem context, without faking the struct. We
should generally only be using these macros from bufmgr_gem context
setup anyway.
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Eugeni Dodonov <eugeni@dodonov.net>
|
|
During free we unconditionally delete the bo from the vma cache. This
relies on its list member being kept in a sane state. This fails
after the object is purged, as the purge operation performs a pure
deletion and doesn't reset the list member, leaving a pair of dangling
pointers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
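
The dangling-pointer problem is a general property of intrusive doubly linked lists and can be shown with a self-contained sketch. The list helpers below are written for illustration and are not libdrm's own list macros.

```c
#include <assert.h>
#include <stddef.h>

struct list_node {
    struct list_node *prev, *next;
};

/* An initialised node points at itself, which makes it an empty list. */
static void list_init(struct list_node *n)
{
    n->prev = n;
    n->next = n;
}

static void list_add(struct list_node *n, struct list_node *head)
{
    assert(n != NULL && head != NULL);
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Pure deletion: unlinks n but leaves n->prev and n->next pointing at its
 * former neighbours. Deleting n a second time would corrupt the list. */
static void list_del_pure(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Deletion that resets the node, so a later unconditional delete only
 * touches the node itself. */
static void list_del_init(struct list_node *n)
{
    list_del_pure(n);
    list_init(n);
}
```

If the purge path uses the resetting variant, the unconditional delete during free becomes a harmless no-op on an already removed node.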
|
|
Hopefully all the bugs in the callers have been found, so time to
handle the failures "gracefully" again.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As the max number of VMA mappings is a hard per-process limit, we need
to include the number of currently active mappings when evicting in
order to make room for a new mmap.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
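
The accounting can be sketched as a small calculation. This is a simplified, hypothetical model rather than the exact formula used in libdrm: it assumes a negative vma_max means "no limit" and reserves room for exactly one new mapping.

```c
#include <assert.h>

/* Returns how many cached (unused) mappings must be evicted so that one
 * more mmap fits under the per-process limit.
 *   vma_max:    the limit, negative for "unlimited" (assumption)
 *   vma_open:   mappings currently in active use
 *   vma_cached: unused mappings kept around for reuse */
static int vma_evict_count(int vma_max, int vma_open, int vma_cached)
{
    int limit;

    assert(vma_open >= 0 && vma_cached >= 0);
    if (vma_max < 0)
        return 0;

    /* Active mappings and the new mmap eat into the cache budget. */
    limit = vma_max - vma_open - 1;
    if (limit < 0)
        limit = 0;

    return vma_cached > limit ? vma_cached - limit : 0;
}
```

Ignoring vma_open in this calculation would let the cache alone stay under the limit while the process as a whole exceeds it.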
|
|
There is a per-process limit on the number of vma that the process can
keep open, so we cannot keep an unlimited cache of unused vma's (besides
keeping track of all those vma in the kernel adds considerable overhead).
However, in order to work around inefficiencies in the kernel it is
beneficial to reuse the vma, so keep an MRU cache of vma.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As a precautionary measure munmap on buffer free so that we never leak
the vma. Also include a warning during debugging.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
We cannot afford to cache the vma per open bo as this may exhaust the
per-process limits.
References: https://bugs.freedesktop.org/show_bug.cgi?id=43075
References: https://bugs.freedesktop.org/show_bug.cgi?id=40066
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Otherwise we blow up on heavy tiled blitter loads (with giant
pixmaps).
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Before this, consumers of the libdrm API that might map a buffer
either way had to track which way was chosen at map time to call the
appropriate unmap. This relaxes that requirement by making
drm_intel_bo_unmap() always appropriate.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This used to be next to some map refcounting code, but that is long dead.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
This lets us replace the current inner drawing loop of mesa:
    for each prim {
        compute bo list
        if (check_aperture_space(bo list)) {
            batch_flush()
            compute bo list
            if (check_aperture_space(bo list)) {
                whine_about_batch_size()
                fall back;
            }
        }
        upload state to BOs
    }
with this inner loop:
    for each prim {
        retry:
        upload state to BOs
        if (check_aperture_space(batch)) {
            if (!retried) {
                reset_to_last_prim()
                batch_flush()
            } else {
                if (batch_flush())
                    whine_about_batch_size()
                goto retry;
            }
        }
    }
This avoids having to implement code to walk over certain sets of GL
state twice (the "compute bo list" step). While it's not a
performance improvement, it's a significant win in code complexity:
about -200 lines, and one place to make mistakes related to aperture
space instead of N places to forget some BO we should have included.
Note how if we do a reset in the new loop, we immediately flush. We
don't need to check aperture space -- the kernel will tell us if we
actually ran out of aperture or not. And if we did run out of
aperture, it's because either the single prim was too big, or because
check_aperture was wrong at the point of setting up the last
primitive.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
A few of the bitfield-based booleans are left in place. Changing them
to "bool" results in the same code size, so I'm erring on the side of
not changing things.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Luckily the kernel has become extremely paranoid about such matters.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Otherwise it's pretty hard to differentiate the different chipset
variants.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
A tile on gen2 has a size of 2kb, stride of 128 bytes and 16 rows.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
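
The tile geometry stated above (128-byte stride times 16 rows gives 2048 bytes) can be turned into a small size calculation. This is a sketch with hypothetical names; it only rounds pitch and height up to whole tiles and ignores any further hardware constraints on fence pitch or alignment.

```c
#include <assert.h>

enum {
    GEN2_TILE_STRIDE = 128,                            /* bytes per tile row */
    GEN2_TILE_ROWS = 16,                               /* rows per tile */
    GEN2_TILE_SIZE = GEN2_TILE_STRIDE * GEN2_TILE_ROWS /* 2048 bytes */
};

/* Rounds v up to the next multiple of a (a must be non-zero). */
static unsigned long align_up(unsigned long v, unsigned long a)
{
    assert(a != 0);
    return (v + a - 1) / a * a;
}

/* Bytes needed for a surface when both dimensions are padded to whole
 * gen2 tiles. */
static unsigned long gen2_tiled_size(unsigned long pitch, unsigned long height)
{
    return align_up(pitch, GEN2_TILE_STRIDE) * align_up(height, GEN2_TILE_ROWS);
}
```

A surface that is 100 bytes wide and 10 rows high therefore occupies exactly one 2048-byte tile.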
|
|
This is Fail.
First patch to libdrm, and I've borked it up.
Noticed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
... and if asked to open a bo by the same global name, return a fresh
reference to the previously allocated buffer.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
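
Returning a fresh reference for an already opened global name can be sketched with a simple lookup list. The struct and function names are hypothetical; the real buffer manager tracks far more state and needs locking, both of which are omitted here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record for a buffer opened by global name. */
struct named_bo {
    unsigned int name;
    int refcount;
    struct named_bo *next;
};

/* Looks the name up in the list of already opened buffers. On a hit the
 * existing buffer gains a reference; on a miss a new record is created
 * with a single reference. Returns NULL if allocation fails. */
static struct named_bo *open_by_name(struct named_bo **list, unsigned int name)
{
    struct named_bo *bo;

    assert(list != NULL);
    for (bo = *list; bo != NULL; bo = bo->next) {
        if (bo->name == name) {
            bo->refcount++;
            return bo;
        }
    }

    bo = calloc(1, sizeof(*bo));
    if (bo == NULL)
        return NULL;
    bo->name = name;
    bo->refcount = 1;
    bo->next = *list;
    *list = bo;
    return bo;
}
```

Opening the same name twice then yields the same object with a reference count of two, instead of two independent handles to one kernel buffer.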
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
gen4+ hardware doesn't use fences for GPU access and the older kernel
doesn't expect userspace to make such a mistake. So don't.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32190
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
... but only account for a fence used if the object is tiled.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
For relaxed fencing the object may only consume the small set of active
pages, but still requires a fence region once bound into the aperture.
This is the size we need to use when computing the maximum possible
aperture space that could be used by a single batchbuffer and so avoid
hitting ENOSPC.
Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
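
The size used for aperture accounting can be sketched as below. The function name is hypothetical, and the minimum fence sizes (512 KiB on gen2, 1 MiB on gen3) and the power-of-two rule for pre-gen4 hardware are assumptions not stated in the message above; treat the numbers as illustrative.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns the aperture space to account for an object of `size` bytes.
 * Untiled objects and gen4+ hardware need no fence region, so they are
 * charged their own size. Tiled objects on older hardware are charged
 * the fence region: the smallest power of two, starting from an assumed
 * per-generation minimum, that covers the object. */
static unsigned long aperture_cost(int gen, unsigned long size, bool tiled)
{
    unsigned long region;

    assert(gen >= 2);
    if (!tiled || gen >= 4)
        return size;

    region = (gen == 3) ? 1024UL * 1024UL : 512UL * 1024UL;
    while (region < size)
        region <<= 1;
    return region;
}
```

Under relaxed fencing a tiled object may back only a few pages with memory, yet it is still charged the full region here, which is what keeps the estimate from undershooting and hitting ENOSPC.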
|
|
Both the consumers of this API (sync objects and client throttling)
were expecting this behavior. The kernel used to actually behave the
desired (but incorrect) way for us anyway, but that got fixed a while
back.
|